(57) A hypertext document and anchor sentences of parent documents for the hypertext document are registered with a hypertext document identifier as document information for each of hypertext documents having reference relationships with each other. A user can refer to one hypertext document according to an anchor sentence of another hypertext document functioning as a parent document. Also, occurrence positions of one word in hypertext documents and parent documents are registered as word information for each of words. When a keyword is input, a plurality of particular hypertext documents and particular parent documents in which the keyword appears are specified according to the word information, one particular hypertext document and corresponding particular parent documents are unified to a unified hypertext document for each particular hypertext document, an occurrence frequency of the keyword in each unified hypertext document is calculated according to the document information, importance degrees of the unified hypertext documents are calculated as those of the particular hypertext documents according to the occurrence frequencies, and the ranking of the particular hypertext documents is determined according to those importance degrees. Because the occurrence frequency is calculated by considering the parent documents, the particular hypertext documents can be appropriately ranked.
Description

BACKGROUND OF THE INVENTION

1. FIELD OF THE INVENTION:

The present invention relates generally to a hypertext document retrieving apparatus, and more particularly to a hypertext document retrieving apparatus in which a plurality of hypertext documents likely to meet a user's retrieval request are retrieved from a large volume of hypertext documents and are presented to the user.

2. DESCRIPTION OF THE RELATED ART:

2.1. PREVIOUSLY PROPOSED ART:

As a conventional apparatus in which one or more documents likely to meet a user's retrieval request are retrieved from a large volume of documents and are presented to the user, a document retrieving apparatus 200 shown in Fig. 1 is known. In this apparatus 200, a large volume of documents stored in a document managing unit 201 are analyzed in advance in a retrieval index developing unit 202, and it is examined how many times each of a plurality of words registered in a dictionary of the retrieval index developing unit 202 appears in each of the documents. That is, an occurrence frequency of each word in one document is calculated for each of the documents stored in the document managing unit 201, a deviation degree IDF of one word in the total documents is calculated as a correction factor for the word for each of the words, a normalized occurrence frequency (called a TF value) of each word is calculated for each of the documents, an estimated value of each document expressed by TF*IDF is calculated for each of the words by multiplying the deviation degree and the normalized occurrence frequency together, and a retrieval index is developed in the retrieval index developing unit 202. In the retrieval index, a set of one word, identification data indicating one or more documents in which the word appears and one estimated value for the word is registered for each of the words.

Thereafter, when a plurality of keywords input by a user 207 are received in a keyword input unit 203, the keywords are transmitted to a retrieving unit 204. In the retrieving unit 204, a plurality of retrieval words agreeing with the input keywords are found out from the retrieval index stored in the retrieval index developing unit 202, a particular set of one retrieval word, identification data indicating one or more retrieval documents in which the retrieval word appears and one estimated value for the retrieval word is taken out for each of the retrieval words from the retrieval index developing unit 202, and the particular sets corresponding to the keywords are transmitted to a document ranking determining unit 205.

In the document ranking determining unit 205, a plurality of identification titles indicating the retrieval documents are arranged in decreasing order of the estimated values of the retrieval documents to determine the ranking of the retrieval documents, and the identification titles arranged according to the ranking of the retrieval documents are displayed as a retrieval result in a retrieval result displaying unit 206. Thereafter, when the user selects the identification titles displayed on the displaying unit 206 one after another in the arranged order, the retrieval document indicated by the selected identification title is read out from the document managing unit 201 to the displaying unit 206 each time one identification title is selected, and the retrieval document is displayed on the retrieval result displaying unit 206 each time one identification title is selected.

Therefore, because the keywords according to a user's retrieval request are input by the user, a plurality of documents likely to meet the user's retrieval request can be presented in the order of the estimated value TF*IDF.

A plurality of calculation methods of the estimated value TF*IDF are known. As an example of one calculation method, the deviation degree IDF (= 1 - log Nw/N) obtained by subtracting a logarithmic value (log Nw/N) of the ratio Nw/N from 1 is defined. Here, the symbol Nw denotes the number of documents in which a remarked word appears, and the symbol N denotes the number of documents stored in the document managing unit 201. Also, the normalized occurrence frequency TF (= Fo/Nwd) obtained by dividing an occurrence frequency Fo of the remarked word in a remarked document by the number Nwd of words appearing in the remarked document is defined. In this case, the estimated value TF*IDF is calculated by multiplying the deviation degree and the normalized occurrence frequency together.
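
The calculation method described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the base of the logarithm is not given in the text, so the natural logarithm is assumed here.

```python
import math


def deviation_degree(nw: int, n: int) -> float:
    """IDF = 1 - log(Nw / N).

    nw: number of documents in which the remarked word appears (Nw).
    n:  number of documents stored in the document managing unit (N).
    The natural logarithm is an assumption; the text does not state the base.
    """
    return 1.0 - math.log(nw / n)


def normalized_frequency(fo: int, nwd: int) -> float:
    """TF = Fo / Nwd.

    fo:  occurrence frequency of the remarked word in the remarked document.
    nwd: number of words appearing in the remarked document.
    """
    return fo / nwd


def estimated_value(fo: int, nwd: int, nw: int, n: int) -> float:
    """Estimated value TF*IDF: product of the two factors above."""
    return normalized_frequency(fo, nwd) * deviation_degree(nw, n)
```

With this definition, a word appearing in every document has a deviation degree of exactly 1, and the deviation degree grows as the word becomes rarer across the stored documents.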

The detail of the estimated value TF*IDF and a conventional document retrieving apparatus in which the estimated value TF*IDF is used are disclosed in a literature "Salton, Gerard: Introduction to Modern Information Retrieval, McGraw-Hill Computer Science Series, 1983".

2.2. PROBLEMS TO BE SOLVED BY THE INVENTION:

However, in cases where one or more particular hypertext documents likely to meet a user's retrieval request are retrieved from a large volume of hypertext documents by using the conventional document retrieving apparatus, because the hypertext documents are not generally independent from each other but often have reference relationships with each other, there is a drawback that the ranking of the particular hypertext documents likely to meet the user's retrieval request cannot be appropriately determined. That is, because contents of a plurality of particular hypertext documents having a referential relationship with each other are often connected with a consistent meaning, the contents of the particular hypertext documents cannot be understood by reading only one particular hypertext document but can be understood by reading all of the particular hypertext documents. Therefore, in cases where one or more particular hypertext documents likely to meet a user's retrieval request are retrieved by using the conventional document retrieving apparatus, an importance degree of each particular hypertext document is erroneously estimated, so that there is a drawback that the ranking of the particular hypertext documents cannot be appropriately determined. Also, even though the particular hypertext documents ranked according to their estimated values are displayed, because the ranking of the particular hypertext documents is not appropriately determined, there is another drawback that the user cannot smoothly select the particular hypertext documents in an appropriate importance degree order.

In particular, because the possibility that a plurality of hypertext documents written in a hypertext mark-up language (HTML) in a world wide web have a referential relationship with each other is considerably high, the ranking of the particular hypertext documents cannot be appropriately determined, and the user cannot smoothly select each of the particular hypertext documents even though the particular hypertext documents ranked according to their estimated values are displayed.



SUMMARY OF THE INVENTION



An object of the present invention is to provide, with due consideration to the drawbacks of such a conventional document retrieving apparatus, a hypertext document retrieving apparatus in which one or more hypertext documents likely to meet a user's retrieval request are retrieved from a large volume of hypertext documents and are appropriately ranked according to their importance degrees to smoothly select each of the hypertext documents even though the hypertext documents are written in the hypertext mark-up language in the world wide web.

To achieve the object of the present invention, in a hypertext document retrieving apparatus, a plurality of particular hypertext documents likely to meet a user's retrieval request are retrieved from a group of hypertext documents having reference relationships with each other, in which one hypertext document having an anchor sentence functions as a parent document for another hypertext document functioning as a reference document and a user refers to one reference document after the user selects one anchor sentence of one parent document corresponding to the reference document.

In detail, in hypertext document table preparing means, hypertext document information, in which one hypertext document identifier identifying one hypertext document, a body of the hypertext document, a parent document identifier identifying a parent document corresponding to the hypertext document functioning as one reference document and an anchor sentence of the parent document are registered, is prepared for each of the hypertext documents, and a hypertext document table of the hypertext document information for all hypertext documents is prepared in advance.

Thereafter, in retrieval index preparing means, a plurality of words appearing in each of the hypertext documents and the parent documents are recognized according to the hypertext document table prepared by the hypertext document table preparing means, a plurality of occurrence positions of the words in each of the hypertext documents and the parent documents are recognized according to the hypertext document table, word information, composed of one or more occurrence document identifiers identifying one or more hypertext documents in which one word appears and occurrence positions of the word in the hypertext documents and one or more anchor sentences of one or more parent documents corresponding to the hypertext documents, is prepared for each of the words, and a retrieval index of pieces of word information for the words is prepared in advance.

Thereafter, when a keyword indicating the user's retrieval request is received in keyword receiving means, particular word information corresponding to the keyword is retrieved in retrieving means from the retrieval index prepared by the retrieval index preparing means. Also, a plurality of particular occurrence document identifiers identifying a plurality of particular hypertext documents in which the keyword appears and a plurality of particular occurrence positions of the keyword in the particular hypertext documents and one or more particular anchor sentences of one or more particular parent documents corresponding to the particular hypertext documents are retrieved from the particular word information.

Thereafter, in document ranking determining means, the particular hypertext documents identified by the particular occurrence document identifiers are specified, pieces of particular hypertext document information for the particular hypertext documents are retrieved from the hypertext document table prepared by the hypertext document table preparing means, one particular hypertext document and one or more particular parent documents corresponding to the particular hypertext document are unified to a unified hypertext document for each of the particular hypertext documents, an occurrence frequency of the keyword in one unified hypertext document is calculated for each unified hypertext document, a plurality of importance degrees of the unified hypertext documents are determined according to the occurrence frequencies in the unified hypertext documents, one importance degree of one unified hypertext document is set as an importance degree of one particular hypertext document corresponding to the unified hypertext document for each unified hypertext document, and the ranking of the particular hypertext documents is determined according to the importance degrees of the unified hypertext documents.

Thereafter, a plurality of indexes of the particular hypertext documents are displayed by retrieval result displaying means in a ranked order corresponding to the ranking of the particular hypertext documents as a retrieval result.
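
The unification and ranking steps summarized above can be illustrated with a minimal sketch. The record layout, the whitespace tokenization and the use of the raw occurrence frequency as the importance degree are all assumptions made for illustration, not details fixed by the text.

```python
from dataclasses import dataclass, field


@dataclass
class DocInfo:
    """One piece of hypertext document information (hypothetical layout)."""
    doc_id: int
    body: str
    # (parent document identifier, anchor sentence of that parent) pairs
    parents: list = field(default_factory=list)


def unified_text(info: DocInfo) -> str:
    """Unify a document body with the anchor sentences of its parents."""
    return " ".join([info.body] + [anchor for _, anchor in info.parents])


def occurrence_frequency(keyword: str, text: str) -> int:
    """Count whole-word occurrences (simple whitespace tokenization)."""
    return text.lower().split().count(keyword.lower())


def rank(keyword: str, table: list) -> list:
    """Return identifiers of documents in decreasing order of importance.

    The importance degree of a document is taken here to be the occurrence
    frequency of the keyword in its unified document; ties are broken by
    identifier so that the result is deterministic.
    """
    scored = []
    for info in table:
        freq = occurrence_frequency(keyword, unified_text(info))
        if freq > 0:
            scored.append((freq, info.doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


table = [
    DocInfo(1, "apple apple orchard"),
    DocInfo(2, "photo gallery", [(1, "apple producing farmer"),
                                 (3, "apple apple harvest")]),
]
ranking = rank("apple", table)
```

In this example, document 2 never mentions the keyword in its own body, yet it is ranked first because the anchor sentences of its parent documents are counted as part of the unified document.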

Because one unified hypertext document is prepared by unifying one particular hypertext document and one or more particular parent documents corresponding to the particular hypertext document for each of the particular hypertext documents and one importance degree of one unified hypertext document is calculated as one importance degree of one particular hypertext document corresponding to the unified hypertext document for each of the unified hypertext documents, the ranking of the particular hypertext documents can be determined by considering the particular parent documents having the reference relationships with the particular hypertext documents. Therefore, even though contents of a plurality of specific hypertext documents having a referential relationship with each other are connected with a consistent meaning, the specific hypertext documents likely to meet the user's retrieval request can be correctly retrieved from a large volume of hypertext documents and be appropriately ranked according to their importance degrees, so that the user can smoothly select the specific hypertext documents in an appropriate importance degree order even though the specific hypertext documents are written in the hypertext mark-up language in the world wide web.



The objects, features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a block diagram of a conventional document retrieving apparatus;
Fig. 2 shows a reference relationship among a plurality of hypertext documents distributively managed in a world wide web of an internet;
Fig. 3 is a block diagram of a hypertext retrieving apparatus according to a first embodiment of the present invention;
Fig. 4 shows a hypertext document table of pieces of hypertext document information prepared in a hypertext document table with parent document list preparing unit shown in Fig. 3;
Fig. 5 shows a retrieval index of pieces of word information prepared in a retrieval index preparing unit shown in Fig. 3;
Fig. 6 is a block diagram of a hypertext retrieving apparatus according to a second embodiment of the present invention;
Fig. 7 shows an example of a retrieval result in which an index of one particular hypertext document is displayed with an index of a first-stage particular parent document and an index of a second-stage particular parent document for each particular hypertext document by a retrieval result displaying unit shown in Fig. 6;

Fig. 8 is a block diagram of a hypertext retrieving apparatus according to a third embodiment of the present invention;
Fig. 9 shows an example of a retrieval result in which indexes of a plurality of particular hypertext documents are displayed with an index of a first-stage particular parent document and an index of a second-stage particular parent document by a retrieval result displaying unit shown in Fig. 8;
Fig. 10 is a block diagram of a hypertext retrieving apparatus according to a fourth embodiment of the present invention;
Fig. 11 is a block diagram of a hypertext retrieving apparatus according to a fifth embodiment of the present invention;
Fig. 12 shows an example of a retrieval result in which an index of one particular hypertext document is displayed with a summary of the particular hypertext document, an index of a first-stage particular parent document and an index of a second-stage particular parent document for each particular hypertext document by a retrieval result displaying unit shown in Fig. 11;

Fig. 13 is a block diagram of a hypertext retrieving apparatus according to a sixth embodiment of the present invention;
Fig. 14 is a block diagram of a hypertext retrieving apparatus according to a seventh embodiment of the present invention;
Fig. 15 is a block diagram of a hypertext retrieving apparatus according to an eighth embodiment of the present invention;
Fig. 16 is a block diagram of a hypertext retrieving apparatus according to a ninth embodiment of the present invention;
Fig. 17 shows the division of a long hypertext document with one or more reference labels;
Fig. 18 is a block diagram of a hypertext retrieving apparatus according to a tenth embodiment of the present invention;
Fig. 19 shows an example of a retrieval result in which indexes of hypertext documents and buttons corresponding to a plurality of high-ranking related words are displayed, according to the tenth embodiment;
Fig. 20 is a block diagram of a hypertext retrieving apparatus according to an eleventh embodiment of the present invention; and
Fig. 21 shows an example of a retrieval result, in which indexes of hypertext documents and buttons corresponding to a plurality of high-ranking related words are displayed, according to the eleventh embodiment.
DETAILED DESCRIPTION OF THE EMBODIMENTS

Preferred embodiments of a hypertext document retrieving apparatus, in which one or more particular hypertext documents likely to meet a user's retrieval request are retrieved from a large volume of hypertext documents distributively managed in a world wide web of an internet, are described with reference to drawings according to the concept of the present invention.

Fig. 2 shows a reference relationship among a plurality of hypertext documents distributively managed in a world wide web of an internet.

As shown in Fig. 2, a plurality of hypertext documents D80 to D86 distributively managed in a world wide web of an internet have a referential relationship with each other. That is, an anchor sentence S800 is placed in the hypertext document D80, an anchor sentence S801 is placed in the hypertext document D81, an anchor sentence S802 is placed in the hypertext document D82, a plurality of anchor sentences S803 to S805 are placed in the hypertext document D83, and an anchor sentence S806 is placed in the hypertext document D84. In each of the anchor sentences, either an identifier identifying a document to which a user can make reference or a position of a document to which a user can make reference is buried.

A document to which a user makes reference is called a reference document in this specification, and a document having one anchor sentence which indicates one or more reference documents is called a parent document in this specification. Also, each anchor sentence is composed of one sentence or a plurality of sentences.

Therefore, when a user reads the parent document D81 displayed on a display of a browsed document selecting means (called a browser) and points out a position of the anchor sentence S801 of the parent document D81 by using a so-called pointing device, the reference document D83 is called and displayed, so that the user can efficiently use the distributed hypertext documents D80 to D86.

A group of the hypertext documents D80 to D86 is written in a hypertext mark-up language, and each hypertext document is called a page, and a character string, an image or a program is written in each hypertext document. For example, in cases where the parent document D81 is stored in a file named "farmer.html", the reference document D83 is stored in a file named "apple.html" and an indicator (or a document storing position) indicating a reference to the reference document D83 is buried in a character string "apple producing farmer" written in the parent document D81 to frame the anchor sentence S801, the anchor sentence S801 is expressed by "<a href="apple.html"> apple producing farmer </a>". In this case, because any sentence is not written in the reference document D83, there is a case that the document D82 is prepared in a computer placed far from another computer, in which the document D83 prepared before the preparation of the document D81 is stored, and the document D82 functions as a parent document for the reference document D83.
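
As an illustration of how an anchor sentence and the document storing position buried in it could be recovered from a parent document, the following sketch uses the `html.parser` module of the Python standard library. The class and function names are hypothetical, and nested anchors are not handled.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor sentence) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.anchors = []     # finished (href, text) pairs
        self._href = None     # href of the anchor currently open, if any
        self._text = []       # text fragments seen inside the open anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            sentence = "".join(self._text).strip()
            self.anchors.append((self._href, sentence))
            self._href = None


def extract_anchors(html: str) -> list:
    """Return every (document storing position, anchor sentence) pair."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.anchors


parent_html = '<p>I am an <a href="apple.html"> apple producing farmer </a>.</p>'
anchors = extract_anchors(parent_html)
```

Here `anchors` holds the single pair of the storing position "apple.html" and the anchor sentence "apple producing farmer", matching the example of the anchor sentence S801.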

(First Embodiment) 

Fig. 3 is a block diagram of a hypertext retrieving apparatus according to a first embodiment of the present invention.

As shown in Fig. 3, a hypertext retrieving apparatus 1 for retrieving one or more hypertext documents likely to meet a user's retrieval request from a large volume of hypertext documents stored in a hypertext document managing unit 8, in which the hypertext documents prepared in a large number of computers widely distributed in a network of a world wide web are distributively managed on condition that the hypertext documents have reference relationships with each other, comprises

a hypertext document table with parent document list preparing unit 7 for analyzing the hypertext documents having the reference relationships which are managed by the hypertext document managing unit 8, preparing hypertext document information in which one or more parent document identifiers identifying one or more parent documents and anchor sentences of the parent documents are listed with one hypertext document identifier identifying one hypertext document and a document storing position of the hypertext document, for each of the hypertext documents, and preparing a hypertext document table of the hypertext document information for all hypertext documents managed by the hypertext document managing unit 8,
a retrieval index preparing unit 6 having a dictionary for analyzing a body of one hypertext document, a title of the hypertext document and character strings of one or more anchor sentences of one or more parent documents corresponding to the hypertext document in advance for each of the hypertext documents managed by the hypertext document managing unit 8 according to the hypertext document table prepared by the hypertext document table with parent document list preparing unit 7 to recognize a plurality of words appearing in the hypertext documents, preparing a piece of word information for one word in which one occurrence document identifier identifying one hypertext document, in which the word registered in the dictionary appears, and positional information indicating occurrence positions of the word in the title of the hypertext document, the body of the hypertext document and the anchor sentences of the parent documents corresponding to the hypertext document are listed for each of the hypertext documents, and preparing a retrieval index of pieces of word information for the words stored in the dictionary,
a keyword input unit 2 for receiving a plurality of keywords input by a user 9,
a retrieving unit 3 for retrieving a plurality of pieces of particular word information corresponding to a plurality of particular words agreeing with the keywords received in the keyword input unit 2 from the retrieval index prepared in the retrieval index preparing unit 6 and retrieving particular occurrence document identifiers identifying particular hypertext documents, in which one particular word agreeing with one keyword appears, and particular positional information indicating particular occurrence positions of one particular word in the particular hypertext documents and a plurality of particular parent documents corresponding to the particular hypertext documents from the particular word information for each of the particular words,
a document ranking determining unit 4 for unifying one particular hypertext document and one or more particular parent documents corresponding to the particular hypertext document to a unified particular hypertext document according to the document information of the hypertext document table prepared by the hypertext document table with parent document list preparing unit 7 for each of the particular hypertext documents obtained in the retrieving unit 3, calculating an occurrence frequency TF of one particular word in one unified particular hypertext document for each particular word and each unified particular hypertext document, calculating an inverse document frequency IDF defined as an inverse value of the number of particular hypertext documents, in which one particular word appears, for each particular word, calculating a product TF*IDF of one occurrence frequency TF and one inverse document frequency IDF, summing a plurality of products for all particular words to produce a summed product as an estimated value for each unified particular hypertext document, determining a plurality of importance degrees of the unified particular hypertext documents according to the estimated values, determining the ranking of the particular hypertext documents according to the importance degrees for the unified particular hypertext documents and preparing an index of one particular hypertext document for each of the particular hypertext documents, and
a retrieval result displaying unit 5 for displaying the indexes of the particular hypertext documents in the ranked order determined in the document ranking determining unit 4 as a retrieval result.
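
The estimated value calculated in the document ranking determining unit 4 can be sketched as below. This is only an illustrative reading of the description: unified documents are represented as hypothetical token lists, the number of documents in which a word appears is counted over the unified documents, and the estimated value is used directly as the importance degree.

```python
def score_documents(keywords: list, unified_docs: dict) -> dict:
    """Sum TF*IDF over all keywords for each unified document.

    unified_docs maps a document identifier to the token list of its
    unified document (title, body and parent anchor sentences).
    TF  = occurrence frequency of the word in the unified document.
    IDF = 1 / (number of documents in which the word appears).
    """
    scores = {doc_id: 0.0 for doc_id in unified_docs}
    for word in keywords:
        tf = {doc_id: tokens.count(word)
              for doc_id, tokens in unified_docs.items()}
        n_docs = sum(1 for count in tf.values() if count > 0)
        if n_docs == 0:
            continue  # the word appears nowhere, so it contributes nothing
        idf = 1.0 / n_docs
        for doc_id, count in tf.items():
            scores[doc_id] += count * idf
    return scores


def ranking(scores: dict) -> list:
    """Identifiers in decreasing order of score; ties broken by identifier."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, score in ordered if score > 0]


unified = {
    "D83": ["apple", "apple", "farmer"],
    "D84": ["apple", "pie"],
    "D85": ["farmer"],
}
scores = score_documents(["apple", "farmer"], unified)
order = ranking(scores)
```

In this example each keyword appears in two documents, so IDF is 0.5 for both, and D83 collects 2 x 0.5 for "apple" plus 1 x 0.5 for "farmer".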

In the above configuration, an operation of the hypertext retrieving apparatus 1 is described. A plurality of hypertext documents having reference relationships with each other are prepared in a large number of computers widely distributed in a network of a world wide web. In the hypertext document managing unit 8, the hypertext documents are distributively managed. The reference document table with parent document preparing unit 7 has a related document collecting function (generally called a web robot). Therefore, when a plurality of document storing position addresses (generally called a plurality of universal resource locators) of a plurality of hypertext documents are given to the reference document table with parent document preparing unit 7, the plurality of hypertext documents are indicated as a plurality of parent documents by the universal resource locators one after another, one or more anchor sentences written in each of the parent documents are analyzed, and one or more reference documents are collected for each of the parent documents. Thereafter, a plurality of hypertext document identifiers not overlapped with each other are allocated to the collected reference documents in the order of collection to identify the collected reference documents. In this case, when any image or program is not written in each of the collected reference documents and a character string is written in each of the collected reference documents, a collecting time can be saved. Also, a plurality of document storing position addresses of the collected reference documents are listed to prohibit that one collected reference document already listed is collected again. Therefore, as shown in Fig. 2, though not only the parent document D83 relates to the reference document D84 according to the anchor sentence S803 but also the parent document D84 relates to the reference document D83 according to the anchor sentence S806, it is prohibited that the hypertext documents D83 and D84 are collected twice.
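
The collecting function with its list of already collected addresses can be sketched as follows. No network access is performed: the `fetch_links` callback is a hypothetical stand-in for analyzing the anchor sentences of a document, and, as a simplification, identifiers are allocated to the starting documents as well as to the collected reference documents.

```python
from collections import deque


def collect(start_urls: list, fetch_links) -> dict:
    """Collect documents reachable from start_urls, each exactly once.

    fetch_links(url) must return the addresses referenced by the anchor
    sentences of the document stored at url.
    Returns a mapping from address to an identifier allocated in the
    order of collection (1, 2, 3, ...).
    """
    seen = set()       # list of addresses already scheduled or collected
    queue = deque()
    for url in start_urls:
        if url not in seen:
            seen.add(url)
            queue.append(url)

    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in fetch_links(url):
            if target not in seen:   # never collect the same document twice
                seen.add(target)
                queue.append(target)

    return {url: number for number, url in enumerate(order, start=1)}


# Hypothetical link structure with a mutual reference, as between D83 and D84.
links = {
    "d81.html": ["d83.html"],
    "d83.html": ["d84.html"],
    "d84.html": ["d83.html"],
}
identifiers = collect(["d81.html"], lambda url: links.get(url, []))
```

Although the two documents refer to each other, the traversal terminates and each address receives exactly one identifier.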

Thereafter, a hypertext document table of pieces of hypertext document information (refer to Fig. 4), in which parent document identifiers of one or more parent documents and anchor sentences of the parent documents are listed for each hypertext document, is prepared in the hypertext document table with parent document list preparing unit 7 according to the following procedure. A plurality of document information entry spaces DS1 to DS3, of which the number is equal to the number of collected reference documents, are prepared. In each of the document information entry spaces, the number of one hypertext document identifier identifying one collected reference document and one document storing position address of the collected reference document are written in the document information entry space. Thereafter, a title of the collected reference document is extracted from the collected reference document by examining a plurality of character strings written in the collected reference document. In this embodiment, a title "apple that I grew" is, for example, extracted from a character string "<title> apple that I grew </title>", and the title is written in the document information entry space. Thereafter, one or more character strings of hypertext mark-up language tags respectively denoting a character string placed between "<" and ">" are removed from a plurality of character strings existing in a body of the collected reference document to form a text body, and the text body is written in the document information entry space. Thereafter, it is checked whether or not one or more anchor sentences relating to one reference document exist in one or more parent documents relating to the reference document. In cases where an anchor sentence exists in a parent document relating to one reference document, a set of a parent document identifier identifying the parent document and the anchor sentence of the parent document is written in the document information entry space to form a parent document list for each hypertext document information. Also, a plurality of words used in the text body, the title and the anchor sentences are written in the document information entry space to form a word list for each hypertext document information.
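
The procedure for filling one document information entry space can be sketched as follows. The dictionary keys, the regular expressions and the whitespace tokenization are assumptions made for illustration; in particular, dropping the title element before forming the text body is a choice of this sketch, not something the description specifies.

```python
import re

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]*>")


def extract_title(html: str) -> str:
    """Return the text between <title> and </title>, or '' if absent."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else ""


def strip_tags(html: str) -> str:
    """Remove every character string placed between '<' and '>'."""
    return " ".join(TAG_RE.sub(" ", html).split())


def make_entry(doc_id: int, url: str, html: str, parents: list) -> dict:
    """Build one entry: identifier, position, title, text body, lists.

    parents is a list of (parent document identifier, anchor sentence).
    """
    title = extract_title(html)
    body = strip_tags(TITLE_RE.sub(" ", html))   # title kept out of the body
    anchors = [anchor for _, anchor in parents]
    words = sorted(set(" ".join([title, body] + anchors).lower().split()))
    return {
        "id": doc_id,
        "url": url,
        "title": title,
        "body": body,
        "parents": list(parents),   # parent document list
        "words": words,             # word list
    }


page = ("<html><title> apple that I grew </title>"
        "<body><h1>Harvest</h1><p>Red apple</p></body></html>")
entry = make_entry(3, "apple.html", page, [(1, "apple producing farmer")])
```

The word list of the resulting entry contains "farmer" even though that word occurs only in the anchor sentence of the parent document.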

Therefore, in the reference document table with parent document preparing unit 7, as shown in Fig. 3, a document information entry space is prepared for each of the hypertext documents managed by the hypertext document managing unit 8, and a hypertext document identifier, a document storing position, a title, a text body, a parent document list and a word list are written in each of the document information entry spaces to prepare a hypertext document table.

In this embodiment, the hypertext document table is
prepared after one or more anchor sentences written in
each of the parent documents are analyzed to collect
the reference documents. Therefore, the anchor
sentences are analyzed or checked twice to determine the
collected reference documents and prepare the
hypertext document table. However, in cases where the
hypertext document table is prepared while analyzing
the anchor sentences to collect the reference
documents, the hypertext document table can be efficiently
prepared.

Thereafter, in the retrieval index preparing unit 6
having a dictionary, a body of a hypertext document, a
title of the hypertext document and character strings of
one or more anchor sentences of the hypertext
document are analyzed in advance for each of the hypertext
documents of the hypertext document table, a piece of
word information composed of a word, one or more
occurrence document identifiers identifying hypertext
documents, in which the word appears, and positional
information indicating occurrence positions of the word
in the hypertext documents is prepared for each of a
plurality of words stored in the dictionary, and a retrieval
index of pieces of word information for the plurality of
words is prepared as shown in Fig. 5.

In detail, tens of thousands of words are registered in
the dictionary of the retrieval index preparing unit 6, and
a plurality of word information entry spaces WS1 to
WS3, of which the number is equal to the number of
words registered in the dictionary, are prepared, and
each of the words is written in one of the word
information entry spaces WS1 to WS3. Thereafter, a word
registered in the word list of one document information
entry space of the hypertext document table is detected
as a particular word, a hypertext document identifier of
a particular hypertext document corresponding to the
document information entry space is detected as an
occurrence hypertext document identifier, one or more
positions of the particular word in the particular
hypertext document are detected as positional information,
and a set of the occurrence hypertext document
identifier and the positional information is written as word
information in a particular word information entry space
corresponding to the particular word. This processing is
performed for each of the words registered in the word
lists of all document information entry spaces of the
hypertext document table, so that a retrieval index of the
pieces of word information corresponding to a plurality
of words used in the hypertext documents is prepared.
Fig. 5 shows a piece of word information of the
retrieval index which is written in the word information
entry space WS1 and corresponds to a word "apple".
"(Title,1)" indicates that the word "apple" appears in the
first word position of the title of the hypertext document
D83, "(Body,4,33,43)" indicates that the word "apple"
appears in the fourth, 33-th and 43-th word positions of
the body of the hypertext document D83, "(000081,1)"
indicates that the word "apple" appears in the first word
position of the anchor sentence S801 of the hypertext
document D81 functioning as the parent document and
"(000082,4)" indicates that the word "apple" appears in
the fourth word position of the anchor sentence S802 of
the hypertext document D82 functioning as the parent
document.
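
A retrieval index of this kind may be sketched as a nested mapping from a word to occurrence documents to positional information. The sketch below is illustrative only: the names tokenize and build_retrieval_index, the dictionary layout and the short sample texts are assumptions made here, and words are simply split on white space instead of being looked up in a dictionary as in the embodiment.

```python
from collections import defaultdict

def tokenize(text):
    # Assumption: words are separated by white space and compared in lower case.
    return text.lower().split()

def build_retrieval_index(table):
    """Map word -> occurrence document identifier -> field -> word positions.

    A field is "Title", "Body" or the identifier of a parent document whose
    anchor sentence refers to the occurrence document.
    """
    index = defaultdict(lambda: defaultdict(dict))
    for doc_id, entry in table.items():
        fields = [("Title", entry["title"]), ("Body", entry["text_body"])]
        fields += list(entry["parent_list"])  # (parent identifier, anchor sentence)
        for field, text in fields:
            for position, word in enumerate(tokenize(text), start=1):
                index[word][doc_id].setdefault(field, []).append(position)
    return index

# Hypothetical sample table with one entry.
table = {
    "D83": {
        "title": "apple that I grew",
        "text_body": "this red apple grew",
        "parent_list": [("000081", "apple diary"), ("000082", "how I grew apple")],
    }
}
index = build_retrieval_index(table)
print(index["apple"]["D83"])
# {'Title': [1], 'Body': [3], '000081': [1], '000082': [4]}
```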

Also, it is applicable that an inverse value of the
number of occurrence documents in which a word
appears (generally called an inverse document
frequency IDF) and the occurrence frequency of the word
in each of the occurrence documents (generally called a
text frequency TF) be calculated in advance in the
retrieval index preparing unit 6 and written in a
corresponding word information entry space for each of the
words. Therefore, a processing time required for the
retrieval can be shortened.

Therefore, in the retrieval index preparing unit 6,
each of the words appearing in the text body of the
hypertext document, the title of the hypertext document
and the anchor sentences of the parent documents
relating to the hypertext document is analyzed, and an
occurrence document list composed of one or more
occurrence document identifiers and the positional
information is prepared for each word. Accordingly, a
retrieval index in which word appearing positions in
each of the hypertext documents are indicated for each
word can be prepared.

The keyword input unit 2 has a function of a text box
and a retrieval starting button for returning contents of
the text box, and an HTML document written according
to the hypertext mark-up language having a title such as
"retrieval page" is employed for the keyword input unit 2.
That is, the user 9 calls the HTML document in the world
wide web browser such as Mosaic or Netscape
operated in his own client computer, a single keyword is
input to the text box or a plurality of keywords divided by
spaces are input to the text box, and the retrieval
starting button is pushed. Therefore, the single keyword or
the plurality of keywords are input.

Therefore, a plurality of keywords input by the user
9 are received in the keyword input unit 2 and are
transmitted to the retrieving unit 3. In this embodiment, the
user inputs each of the keywords by pushing a plurality
of keys arranged on a keyboard. However, in cases
where each of a plurality of candidates for a keyword is
selected by pushing a button, a keyword input operation
using the pointing device can be easily performed
without using any keyboard even though an unskilled
person operates the keyword input unit 2.

In the retrieving unit 3, pieces of particular word
information corresponding to a plurality of particular
words, which agree with the keywords input by the
keyword input unit 2, are extracted from the retrieval index
stored in the retrieval index preparing unit 6, and one or
more occurrence document identifiers identifying one or
more particular hypertext documents, in which one
particular word agreeing with one keyword appears, and
positional information indicating positions of the
particular word in the particular hypertext documents are
obtained from one piece of word information for each of
the particular words. A plurality of sets of the occurrence
document identifiers and the positional information are
transmitted to the document ranking determining unit 4.

In the document ranking determining unit 4, pieces
of hypertext document information corresponding to the
particular hypertext documents identified by the
occurrence document identifiers are extracted from the
hypertext document table, and one particular hypertext
document and one or more parent documents identified
by one or more parent document identifiers listed in one
piece of hypertext document information corresponding
to the particular hypertext document are unified to a
unified particular hypertext document. The unified
particular hypertext document is formed for each of the
particular hypertext documents which are identified by the
occurrence document identifiers transmitted from the
retrieving unit 3. Thereafter, an inverse document
frequency IDF defined as an inverse value of the number
of unified particular hypertext documents in which one
particular word agreeing with one keyword appears and
the occurrence frequency TF of one particular word in
each of the unified particular hypertext documents are
calculated for each of the particular words according to
the plurality of sets of the occurrence document
identifiers and the positional information. The inverse
document frequency IDF denotes a correction factor for each
particular word.

Thereafter, in cases where only one keyword is
input, an estimated value obtained by multiplying the
inverse document frequency IDF for one particular word
and the occurrence frequency TF together is calculated
as an importance degree for each of the unified
particular hypertext documents. Also, in cases where the
number of keywords input by the user is two or more, a
product TF*IDF of one occurrence frequency TF and
one inverse document frequency IDF is calculated for
each keyword and each unified particular hypertext
document, a sum of the products calculated for all keywords
is adopted as an estimated value for each of the unified
particular hypertext documents, and an importance
degree for each of the unified particular hypertext
documents is determined according to the estimated values.
The importance degree for each unified particular
hypertext document is set as an importance degree for
one particular hypertext document corresponding to the
unified particular hypertext document. Thereafter, the
ranking of the particular hypertext documents including
the parent documents is determined according to the
importance degrees of the particular hypertext
documents.
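
The calculation of the estimated values may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name rank_documents and the input layout (occurrence frequencies already counted over each unified particular hypertext document) are introduced here, and ties are broken by document identifier only to make the output deterministic.

```python
def rank_documents(keywords, tf):
    """Rank unified documents by the sum of TF*IDF over all keywords.

    tf maps word -> {document identifier: occurrence frequency TF in the
    unified document (the document plus its parent documents)}.
    """
    scores = {}
    for word in keywords:
        postings = tf.get(word, {})
        if not postings:
            continue  # the keyword appears in no document
        # IDF is the inverse of the number of unified documents containing the word.
        idf = 1.0 / len(postings)
        for doc_id, frequency in postings.items():
            scores[doc_id] = scores.get(doc_id, 0.0) + frequency * idf
    # Higher estimated value first; identifier as tie-break (assumption).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical occurrence frequencies.
tf = {"apple": {"D83": 6, "D85": 1}, "pie": {"D85": 2}}
ranking = rank_documents(["apple", "pie"], tf)
print(ranking)  # [('D83', 3.0), ('D85', 2.5)]
```

With the single keyword "apple", the same function reduces to the product of TF and IDF for that word.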



In cases where the number of keywords is two or
more, it is applicable that an estimated value for one
particular hypertext document be set to a value N times
(N is two or more) as high as a sum of the products
TF*IDF calculated for all keywords when N particular
words agreeing with N keywords appear in the
particular hypertext document. In this case, because the
correlation among the N keywords is reflected on the
importance degree for each particular hypertext
document, the user's retrieval request can be more fully
satisfied.

Also, in cases where two particular words agreeing
with two keywords are used in one particular hypertext
document close to each other within 20 characters, it is
applicable that an estimated value for the unified
particular hypertext document be doubled. In this case,
because the correlation between the two keywords
close to each other is reflected on the importance
degree for each particular hypertext document, the
user's retrieval request can be more fully satisfied.
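
The two optional corrections described above may be sketched together as follows. The sketch is an assumption-laden illustration: the function name boosted_estimate and the input layout are introduced here, "within 20 characters" is interpreted as a difference of at most 20 between character offsets, and applying both corrections to the same estimated value is a choice made for the example, since the description presents them as separate options.

```python
def boosted_estimate(base_sum, char_positions, max_gap=20):
    """Apply the N-times and the proximity corrections to a TF*IDF sum.

    char_positions maps each keyword to the character offsets at which it
    appears in the document (an empty list when it does not appear).
    """
    matched = [word for word, offsets in char_positions.items() if offsets]
    estimate = base_sum
    # N-times correction: N keywords (N >= 2) appear in the same document.
    if len(matched) >= 2:
        estimate *= len(matched)
    # Proximity correction: two different keywords lie within max_gap characters.
    close = any(
        abs(a - b) <= max_gap
        for i, first in enumerate(matched)
        for second in matched[i + 1:]
        for a in char_positions[first]
        for b in char_positions[second]
    )
    if close:
        estimate *= 2
    return estimate

print(boosted_estimate(2.5, {"apple": [10], "pie": [25]}))   # 10.0
print(boosted_estimate(2.5, {"apple": [10], "pie": [100]}))  # 5.0
```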

Thereafter, in the document ranking determining
unit 4, an HTML document, in which a plurality of
indexes of the particular hypertext documents are listed
in the ranked order, is prepared and transmitted to the
retrieval result displaying unit 5. In this case, the index of
one particular hypertext document is a title of the
particular hypertext document or a character string of an
anchor sentence written in one of the parent
documents, a document storing position address
indicating a position of the particular hypertext document in the
hypertext document managing unit 8 is embedded in the
index of the particular hypertext document, and the
index functions as an anchor sentence. That is, when
the user selects one index of one particular hypertext
document, the particular hypertext document is called
from the hypertext document managing unit 8 according
to the document storing position address.
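
The HTML document listing the indexes may be sketched as follows. This is only an illustration: the function name render_results, the page title and the use of an ordered list are assumptions made here, and the example address is hypothetical.

```python
from html import escape

def render_results(ranked):
    """Build an HTML page whose entries are anchors in the ranked order.

    ranked is a list of (index text, document storing position address),
    already sorted by importance degree.
    """
    items = [
        '<li><a href="{}">{}</a></li>'.format(escape(address, quote=True), escape(label))
        for label, address in ranked
    ]
    return (
        "<html><head><title>retrieval result</title></head><body><ol>\n"
        + "\n".join(items)
        + "\n</ol></body></html>"
    )

page = render_results([("apple that I grew", "http://a.example/d83.html")])
print(page)
```

Selecting an entry in a browser then requests the document at the embedded address, which corresponds to calling the particular hypertext document from the hypertext document managing unit 8.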

Therefore, in the document ranking determining
unit 4, one or more parent documents having a
reference relationship with one particular hypertext
document are extracted from the hypertext document table
prepared in the reference document table with parent
document preparing unit 7 for each particular hypertext
document, one particular hypertext document and one
or more parent documents having a reference
relationship with the particular hypertext document are unified
to a unified particular hypertext document for each
particular hypertext document, an importance degree of
the particular hypertext document including the parent
documents is determined according to an estimated
value TF*IDF*N for each particular hypertext document,
the particular hypertext documents are ranked
according to those importance degrees, and the particular
hypertext documents are listed in the ranked order.

In this embodiment, the occurrence frequency TF of
the word is not normalized because the occurrence
frequency TF is not divided by a size of one unified
particular hypertext document. However, in cases where the
occurrence frequency TF of the word is normalized by
dividing the occurrence frequency TF by a size of one
unified particular hypertext document, it is required that
a size of each hypertext document is written in the
hypertext document table.

The retrieval result displaying unit 5 is embodied by
the world wide web browser such as Mosaic or
Netscape operated in the user's own client computer. The
HTML document prepared in the document ranking
determining unit 4 is displayed on a display of the client
computer. Thereafter, when the user selects one index of one
particular hypertext document listed in the HTML
document by using a pointing device, a position of the
particular hypertext document selected by the user is
ascertained according to the document storing position
address embedded in the index of the particular hypertext
document, and the particular hypertext document is
called from the hypertext document managing unit 8.

Therefore, in the retrieval result displaying unit 5,
the indexes of the particular hypertext documents listed
in the HTML document are displayed, the user selects
one index of one particular hypertext document, and the
particular hypertext document selected by the user is
called from the hypertext document managing unit 8.

Accordingly, because one or more parent
documents having a reference relationship with each
reference document are listed in the hypertext document
table prepared by the reference document table with
parent document preparing unit 7, the parent
documents corresponding to one reference document can
be specified by extracting the document information
corresponding to the reference document from the
hypertext document table. Therefore, because it is not
required to ask the hypertext document managing unit 8
for one or more parent documents corresponding to the
reference document, one or more parent documents
corresponding to each reference document can be
quickly ascertained.

Also, because one particular hypertext document
and one or more parent documents having a reference
relationship with the particular hypertext document are
unified as a unified particular hypertext document in
the document ranking determining unit 4, an importance
degree can be determined for each of the unified
particular hypertext documents. Therefore, the ranking of the
particular hypertext documents in which one particular
word agreeing with one keyword appears can be
determined according to the importance degrees while
considering the parent documents corresponding to each
particular hypertext document. Accordingly, the indexes
of the particular hypertext documents can be displayed
by the retrieval result displaying unit 5 according to the
ranking of the particular hypertext documents on
condition that the user's retrieval request expressed by the
keyword is reliably satisfied, and the user can select
the particular hypertext documents in the ranked order.

Also, because one hypertext document and one or
more anchor sentences of one or more parent
documents having a reference relationship with the hypertext
document are listed in each piece of document
information of the hypertext document table prepared by the
reference document table with parent document preparing
unit 7, each piece of word information of the retrieval
index indicating that a word appears in one hypertext
document and one or more anchor sentences of one or
more parent documents having a reference relationship
with the hypertext document can be easily prepared in
the retrieval index preparing unit 6. In addition, because
one or more parent documents having a reference
relationship with each reference document are listed in the
hypertext document table prepared by the reference
document table with parent document preparing unit 7,
when the retrieval index is prepared in the retrieval
index preparing unit 6, it is not required to ask the
hypertext document managing unit 8 for one or more parent
documents corresponding to the reference document.
Therefore, the retrieval index can be quickly prepared.

(Second Embodiment) 

Fig. 6 is a block diagram of a hypertext retrieving
apparatus according to a second embodiment of the
present invention.

As shown in Fig. 6, a hypertext retrieving apparatus
11 for retrieving one or more hypertext documents likely
to meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8, comprises the hypertext document
table with parent document list preparing unit 7, the
retrieval index preparing unit 6, the keyword input unit 2,
the retrieving unit 3,

a document ranking determining unit 12 for unifying
one particular hypertext document and one or more
particular parent documents corresponding to the
particular hypertext document to a unified particular
hypertext document according to the document
information of the hypertext document table
prepared by the hypertext document table with parent
document list preparing unit 7 for each of the
particular hypertext documents obtained in the retrieving
unit 3, calculating estimated values for the unified
particular hypertext documents according to the
particular word information of the retrieval index
obtained in the retrieval index preparing unit 6,
determining a plurality of importance degrees of the
unified particular hypertext documents according to
the estimated values, determining the ranking of
the particular hypertext documents according to the
importance degrees for the unified particular
hypertext documents and preparing an index of one
particular hypertext document with an index of a
particular parent document corresponding to the
particular hypertext document for each of the
particular hypertext documents, and

a retrieval result displaying unit 13 for displaying the
index of the particular hypertext document with the
index of the particular parent document for each of
the unified particular hypertext documents in the
ranked order determined in the document ranking
determining unit 12 as a retrieval result.

In the above configuration, after the ranking of the
particular hypertext documents is determined according
to the importance degrees in the document ranking
determining unit 12 in the same manner as in the first
embodiment, not only an index of one particular
hypertext document but also an index of a particular parent
document corresponding to the particular hypertext
document are prepared for each of the particular
hypertext documents. In cases where a plurality of parent
documents corresponding to the particular hypertext
document exist, one parent document of which the
document storing position is closest to that of the particular
hypertext document among those of the parent
documents is selected as the particular parent document.
This selection is performed by comparing a portion of a
character string indicating the document storing
position of each parent document with a portion of a
character string indicating the document storing position of the
particular hypertext document. Also, in this
embodiment, the particular parent document (or a first-stage
particular parent document) is regarded as a
second-stage reference document, a second-stage particular
parent document having a reference relationship with
the second-stage reference document is specified, and
an index of the second-stage particular parent
document is prepared. Thereafter, the index of one particular
hypertext document is displayed with the index of the
first-stage particular parent document and the index of
the second-stage particular parent document for each
particular hypertext document by the retrieval result
displaying unit 13.
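
The selection of the first-stage and second-stage particular parent documents may be sketched as follows. The description only states that portions of the character strings indicating the document storing positions are compared, so measuring closeness by the length of the common leading portion is an assumption made here, as are the function names and the example addresses.

```python
def common_prefix_len(a, b):
    """Length of the leading portion shared by two address strings."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length

def closest_parent(doc_address, parent_addresses):
    """Return the parent whose address shares the longest prefix, or None."""
    if not parent_addresses:
        return None
    return max(
        sorted(parent_addresses),  # sorted for a deterministic tie-break
        key=lambda pid: common_prefix_len(doc_address, parent_addresses[pid]),
    )

def parent_chain(doc_id, addresses, parents, stages=2):
    """Follow the closest parent for the given number of stages."""
    chain, current = [], doc_id
    for _ in range(stages):
        candidates = {
            pid: addresses[pid]
            for pid in parents.get(current, [])
            if pid != doc_id and pid not in chain
        }
        nearest = closest_parent(addresses[current], candidates)
        if nearest is None:
            break
        chain.append(nearest)
        current = nearest
    return chain

# Hypothetical storing position addresses and reference relationships.
addresses = {
    "D80": "http://a.example/index.html",
    "D81": "http://a.example/fruit/index.html",
    "D82": "http://b.example/links.html",
    "D83": "http://a.example/fruit/apple.html",
    "D86": "http://a.example/fruit/apple/harvest.html",
}
parents = {"D86": ["D83"], "D83": ["D81", "D82"], "D81": ["D80"]}
print(parent_chain("D83", addresses, parents))  # ['D81', 'D80']
print(parent_chain("D86", addresses, parents))  # ['D83', 'D81']
```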

Fig. 7 shows an example of the index of one
particular hypertext document displayed with the index of the
first-stage particular parent document and the index of
the second-stage particular parent document for each
particular hypertext document by the retrieval result
displaying unit 13.

As shown in Fig. 7, in cases where the fourth rank
is given to the hypertext document D83, the 18-th rank
is given to the hypertext document D85 and the 19-th
rank is given to the hypertext document D86, the index
of the particular hypertext document D83 is displayed
with the index of the first-stage particular parent
document D81 and the index of the second-stage particular
parent document D80 as a fourth ranking group, the
index of the particular hypertext document D85 is
displayed with the index of the first-stage particular parent
document D83 and the index of the second-stage
particular parent document D81 as an 18-th ranking group,
and the index of the particular hypertext document D86
is displayed with the index of the first-stage particular
parent document D83 and the index of the
second-stage particular parent document D81 as a 19-th
ranking group.

Accordingly, even though the hypertext document
D86 having no anchor sentence is selected as one
particular hypertext document, the hypertext document
D83 or D81 having a close relation with the hypertext
document D86 can be easily selected and called from
the hypertext document managing unit 8 without relying
on any anchor sentence. That is, because a plurality of
hypertext documents having a reference relationship
with each other closely relate to each other, the display
of the indexes of the first-stage and second-stage
particular parent documents is very useful for the user.

(Third Embodiment) 

In the first or second embodiment, in cases where
the hypertext document D83 of the fourth rank is called
and read, the hypertext document D85 is called and
read by selecting the position of the anchor sentence
S804 and a plurality of hypertext documents of lower
ranks following the fourth rank are called and read one
by one, there is a probability that the hypertext
document D85 of the 18-th rank is erroneously called and
read again because the user forgets the reading of the
hypertext document D85 though the hypertext
document D85 has been already read. Also, even though the
hypertext document D86 of the 19-th rank is called and
read, because a long time elapses after the hypertext
document D83 of the fourth rank is called and read,
there is a probability that the user cannot understand
the context of the hypertext document D86 closely relating
to the context of the hypertext document D83. Therefore, to
solve the above drawbacks in the third embodiment, the
ranks given to a plurality of hypertext documents closely
relating to each other are set to the same rank.

Fig. 8 is a block diagram of a hypertext retrieving
apparatus according to a third embodiment of the
present invention.

As shown in Fig. 8, a hypertext retrieving apparatus
21 for retrieving one or more hypertext documents likely
to meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8, comprises the hypertext document
table with parent document list preparing unit 7, the
retrieval index preparing unit 6, the keyword input unit 2,
the retrieving unit 3,

a document ranking determining unit 22 for unifying
one particular hypertext document and one or more
particular parent documents corresponding to the
particular hypertext document to a unified particular
hypertext document according to the document
information of the hypertext document table
prepared by the hypertext document table with parent
document list preparing unit 7 for each of the
particular hypertext documents obtained in the retrieving
unit 3, calculating estimated values for the unified
particular hypertext documents according to the
particular word information of the retrieval index
obtained in the retrieval index preparing unit 6,
determining a plurality of importance degrees of the
unified particular hypertext documents according to
the estimated values, determining the ranking of
the particular hypertext documents according to the
importance degrees for the unified particular
hypertext documents on condition that ranks given to two
or more particular hypertext documents closely
relating to each other are set to the same rank and
preparing an index of one particular hypertext
document for each of the particular hypertext
documents, and

a retrieval result displaying unit 23 for displaying the
indexes of the particular hypertext documents in the
ranked order determined in the document ranking
determining unit 22 as a retrieval result on condition
that two or more particular hypertext documents set
to the same rank are displayed with one or more
particular parent documents corresponding to any
of the particular hypertext documents in common in
a group.

In the above configuration, after the importance
degrees of the particular hypertext documents are
calculated and the ranking of the particular hypertext
documents is determined according to the importance
degrees in the document ranking determining unit 22 in
the same manner as in the first embodiment, one or
more parent document identifiers listed in one piece of
document information of the hypertext document table
corresponding to one particular hypertext document are
extracted, and one or more parent documents identified
by the parent document identifiers are specified for each
particular hypertext document. Thereafter, it is judged
whether or not each of the parent documents agrees
with one of the particular hypertext documents. In cases
where one parent document corresponding to a first
particular hypertext document of a rank A agrees with a
second particular hypertext document of a rank B, it is
judged that the first and second particular hypertext
documents closely relate to each other, and the first and
second particular hypertext documents are reset to the
higher rank between the ranks A and B. Thereafter,
indexes of the particular hypertext documents are
displayed in the ranked order by the retrieval result
displaying unit 23.

For example, because the parent document D83
corresponding to the hypertext document D85 of the
18-th rank agrees with the hypertext document D83 of the
fourth rank, the hypertext document D85 is reset to the
fourth rank. Also, because the parent document D83
corresponding to the hypertext document D86 of the
19-th rank agrees with the hypertext document D83 of the
fourth rank, the hypertext document D86 is reset to the
fourth rank.

Therefore, because a plurality of particular
hypertext documents closely relating to each other are set to
the same rank and are displayed close to each other,
the user can consecutively read the particular hypertext
documents closely relating to each other, so that the user
can easily grasp the contexts of the particular
hypertext documents. Accordingly, the same particular
hypertext document is prevented from being erroneously read
again, and the user can efficiently read a group of
particular hypertext documents closely relating to each other.
In this embodiment, a plurality of particular
hypertext documents closely relating to each other are set to
the highest rank among the ranks given to the plurality
of particular hypertext documents. However, the third
embodiment is not limited to this concept. That is, when
a plurality of particular hypertext documents closely
relating to each other are determined, it is applicable that
a sum of the importance degrees of the particular
hypertext documents be calculated and the particular
hypertext documents be reset to the same higher rank
according to the summed importance degree.
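
The resetting of ranks in this embodiment may be sketched as follows for the variant in which closely related documents take the highest of their ranks. This is an illustrative sketch only: the function name merge_ranks and the input layout are assumptions made here, and repeating the comparison until no rank changes (so that longer chains of references also end up at one rank) goes beyond the pairwise judgment stated in the description.

```python
def merge_ranks(ranks, parents):
    """Give closely related retrieved documents the same (highest) rank.

    ranks maps document identifier -> rank, where 1 is the highest rank.
    parents maps document identifier -> list of parent document identifiers.
    Two retrieved documents are closely related when one is a parent of the other.
    """
    merged = dict(ranks)
    changed = True
    while changed:  # repeat until stable (assumption, see lead-in)
        changed = False
        for doc_id, parent_ids in parents.items():
            if doc_id not in merged:
                continue
            for pid in parent_ids:
                # Only a parent that is itself a retrieved document counts.
                if pid in merged and merged[pid] != merged[doc_id]:
                    best = min(merged[pid], merged[doc_id])
                    merged[pid] = merged[doc_id] = best
                    changed = True
    return merged

ranks = {"D83": 4, "D85": 18, "D86": 19, "D90": 7}  # D90 is a hypothetical unrelated document
parents = {"D85": ["D83"], "D86": ["D83"], "D83": ["D81"]}
print(merge_ranks(ranks, parents))
# {'D83': 4, 'D85': 4, 'D86': 4, 'D90': 7}
```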

Also, it is preferred that the concept of the second
embodiment and the concept of the third embodiment
be combined. For example, as shown in Fig. 7, when a
first group of the particular hypertext document D83 and
the parent documents D80 and D81 is set to the fourth
rank, a second group of the particular hypertext
document D85 and the parent documents D81 and D83 is
set to the 18-th rank and a third group of the particular
hypertext document D86 and the parent documents
D81 and D83 is set to the 19-th rank according to the
second embodiment, the second group of documents
D81, D83 and D85 set to the 18-th rank is reset to the
fourth rank, and the third group of documents D81, D83
and D86 set to the 19-th rank is reset to the fourth rank,
and a combined group of the particular hypertext
documents D83, D85 and D86 and the parent documents
D80 and D81 reset to the fourth rank is displayed as
shown in Fig. 9.

(Fourth Embodiment)

In general, a special word indicating a feature of a
reference document appears many times in one or more
anchor sentences of one or more parent documents
corresponding to the reference document. Therefore, in
cases where an estimated value for the reference
document is calculated by considering the special word
appearing in the anchor sentences of the parent
documents and the reference document is ranked according
to the estimated value, reliability for the retrieval of a
plurality of hypertext documents likely to meet a user's
retrieval request can be improved.

Fig. 10 is a block diagram of a hypertext retrieving
apparatus according to a fourth embodiment of the
present invention.

As shown in Fig. 10, a hypertext retrieving
apparatus 31 for retrieving one or more hypertext documents
likely to meet a user's retrieval request from a large
volume of hypertext documents stored in the hypertext
document managing unit 8, comprises the hypertext
document table with parent document list preparing unit
7, the retrieval index preparing unit 6, the keyword input
unit 2, the retrieving unit 3,

a document ranking determining unit 32 for
calculating an occurrence frequency of each particular
word in one particular hypertext document and one
or more anchor sentences of one or more particular
parent documents corresponding to the particular
hypertext document as a revised occurrence
frequency TF for the particular hypertext document for
each of the particular hypertext documents
according to the particular word information of the retrieval
index obtained in the retrieval index preparing unit
6, calculating estimated values of the particular
hypertext documents according to the revised
occurrence frequencies TF and inverse document
frequencies IDF, determining a plurality of
importance degrees of the particular hypertext
documents according to the estimated values,
determining the ranking of the particular hypertext
documents according to the importance degrees
and preparing indexes of the particular hypertext
documents, and

a retrieval result displaying unit 33 for displaying the
indexes of the particular hypertext documents in the
ranked order determined in the document ranking
determining unit 32 as a retrieval result.

In the above configuration, in cases where the user
inputs a keyword "apple", as shown in Fig. 4, the
particular word "apple" appears four times in the title
and the body of the hypertext document D83. Also, the
particular word "apple" appears in the anchor sentence
S801 of the parent document D81 and the anchor sentence
S802 of the parent document D82. Therefore, because the
sum of the occurrence frequencies of the particular word
"apple" in the hypertext document D83 and the anchor
sentences S801 and S802 of the parent documents D81 and
D82 is 6, a revised occurrence frequency TF for the
particular hypertext document D83 is set to 6, and an
estimated value of the particular hypertext document D83
is calculated by using the revised occurrence frequency
TF in the document ranking determining unit 32.
Accordingly, the particular hypertext document D83 is
ranked to a higher rank, so that reliability of the
retrieval of the particular hypertext document D83 can be
improved.
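
As an illustration only, the revised occurrence frequency described above can be sketched in Python. The function names, the tokenizer and the sample texts are assumptions made for this sketch and are not part of the disclosure; the sample texts are chosen so that "apple" appears four times in the document and once in each anchor sentence, as in the D83 example.

```python
import re


def count_word(text: str, word: str) -> int:
    """Count case-insensitive whole-word occurrences of `word` in `text`."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tokens.count(word.lower())


def revised_tf(word: str, document_text: str, anchor_sentences: list) -> int:
    """Occurrences in the document itself plus occurrences in the anchor
    sentences of its parent documents (hypothetical helper)."""
    own = count_word(document_text, word)
    from_anchors = sum(count_word(s, word) for s in anchor_sentences)
    return own + from_anchors


# Hypothetical stand-ins for document D83 and anchor sentences S801, S802.
d83_text = ("Apple. The apple is a fruit. An apple tree bears fruit, "
            "and each apple ripens in autumn.")
anchors = ["Click here for apple varieties", "About the apple"]

print(revised_tf("apple", d83_text, anchors))  # 4 in D83 + 1 + 1 = 6
```

The revised frequency would then replace the plain frequency wherever the estimated value is computed.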

(Fifth Embodiment)

In the first to fourth embodiments, in cases where the
user desires to know an outline of the contents of one
particular hypertext document when an index of the
particular hypertext document is displayed, it is required
to call the particular hypertext document from the
hypertext document managing unit 8. Therefore, in cases
where the user desires to read the contents of many
particular hypertext documents, it is troublesome for the
user to call each of the particular hypertext documents.

Fig. 11 is a block diagram of a hypertext retrieving
apparatus according to a fifth embodiment of the
present invention.

As shown in Fig. 11, a hypertext retrieving apparatus 41
for retrieving one or more hypertext documents likely to
meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8 comprises the hypertext document table
with parent document list preparing unit 7, the retrieval
index preparing unit 6, the keyword input unit 2, the
retrieving unit 3,

a document ranking determining unit 42 for unifying one
particular hypertext document and one or more particular
parent documents corresponding to the particular
hypertext document to a unified particular hypertext
document according to the document information of the
hypertext document table prepared by the hypertext
document table with parent document list preparing unit
7 for each of the particular hypertext documents obtained
in the retrieving unit 3, calculating estimated values for
the unified particular hypertext documents for each
particular word according to the particular word
information of the retrieval index obtained in the
retrieval index preparing unit 6, determining a plurality
of importance degrees of the unified particular hypertext
documents according to the estimated values for each
particular word, determining the ranking of the particular
hypertext documents according to the importance degrees
for the unified particular hypertext documents for each
particular word, preparing an index of one particular
hypertext document for each of the particular hypertext
documents and preparing a plurality of summaries of the
particular hypertext documents for each of the particular
words, and

a retrieval result displaying unit 43 for displaying a
group of the indexes of the particular hypertext documents
with the summaries of the particular hypertext documents
in the ranked order determined in the document ranking
determining unit 42 for each particular word as a
retrieval result.

In the above configuration, after the indexes of the
particular hypertext documents are prepared in the
document ranking determining unit 42, a particular
sentence or a particular phrase including one particular
word is extracted from one particular hypertext document
according to the positional information of the word
information of the retrieval index prepared by the
retrieval index preparing unit 6, and a summary in which
the particular sentence or the particular phrase is written
in succession to a top sentence or a top phrase of the
particular hypertext document is prepared for each of the
particular words and each of the particular hypertext
documents. In cases where a plurality of particular
sentences or a plurality of particular phrases including
one particular word exist in one particular hypertext
document, a summary in which the particular sentences or
the particular phrases, arranged in their order of
appearance, are written in succession to a top sentence or
a top phrase of the particular hypertext document is
prepared. Thereafter, the indexes of the particular
hypertext documents with the summaries of the particular
hypertext documents are displayed for each particular
word by the retrieval result displaying unit 43 in the
ranked order determined in the document ranking
determining unit 42.
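
The summary preparation described above may be sketched as follows. This is only an illustration under stated assumptions: sentences are found by a naive split on ".", "!" and "?" rather than by the positional information of the retrieval index, the top sentence is not repeated if it also contains the word, and the names split_sentences and summarize are hypothetical.

```python
import re


def split_sentences(text: str) -> list:
    """Naive sentence splitter (assumption: sentences end in . ! or ?)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text: str, word: str) -> str:
    """Top sentence followed by every later sentence containing `word`,
    kept in order of appearance."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    top, rest = sentences[0], sentences[1:]
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = [s for s in rest if pattern.search(s)]
    return " ".join([top] + hits)


sample = "Fruit guide. The apple is sweet. Pears are green. An apple a day."
print(summarize(sample, "apple"))
```

One summary would be produced per particular word and per particular hypertext document, and shown next to the corresponding index.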

Accordingly, because the summary of one particular
hypertext document is displayed for each of the particular
hypertext documents, the user can grasp an outline of the
contents of each particular hypertext document by reading
its summary without calling the particular hypertext
document from the hypertext document managing unit 8,
and the user can easily select one or more particular
hypertext documents meeting the user's retrieval request.

In this embodiment, even though a particular sentence or
a particular phrase including one particular word appears
many times in one particular hypertext document, all
particular sentences or all particular phrases including
the particular word are extracted from the particular
hypertext document, and a summary is prepared. However,
in cases where a summary of one particular hypertext
document obtained by connecting a series of particular
sentences or a series of particular phrases of the
particular hypertext document with a top sentence or a
top phrase of the particular hypertext document becomes
too long, it is difficult for the user to quickly grasp
the summary. Therefore, it is applicable that three
particular sentences or three particular phrases of the
particular hypertext document be connected with a top
sentence or a top phrase of the particular hypertext
document to prepare a summary for each particular word
when the number of keywords input by the user is five or
less, two particular sentences or two particular phrases
of the particular hypertext document be connected with a
top sentence or a top phrase of the particular hypertext
document to prepare a summary for each particular word
when the number of keywords input by the user is ten or
less, or one particular sentence or one particular phrase
of the particular hypertext document be connected with a
top sentence or a top phrase of the particular hypertext
document to prepare a summary for each particular word
when the number of keywords input by the user is eleven
or more. Therefore, the summary is prevented from
becoming too long, and the user can efficiently read a
number of summaries displayed by the retrieval result
displaying unit 43.
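
The length limit described above can be written as a small rule. The following Python sketch is illustrative only; the function names are hypothetical, and "ten or less" is read here as six to ten keywords, since five or less is covered by the first case.

```python
def max_extracted_sentences(num_keywords: int) -> int:
    """Number of extracted sentences (or phrases) allowed per summary."""
    if num_keywords <= 5:
        return 3
    if num_keywords <= 10:
        return 2
    return 1


def cap_summary(top_sentence: str, matching_sentences: list,
                num_keywords: int) -> list:
    """Top sentence plus at most the allowed number of matching sentences,
    kept in their order of appearance."""
    limit = max_extracted_sentences(num_keywords)
    return [top_sentence] + list(matching_sentences[:limit])


print(cap_summary("Top.", ["s1", "s2", "s3", "s4"], num_keywords=7))
```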

Also, it is preferred that the concept of the second
embodiment and the concept of the fifth embodiment be
combined. For example, when a first group of the
particular hypertext document D83 and the parent
documents D80 and D81 is set to the fourth rank, a second
group of the particular hypertext document D85 and the
parent documents D81 and D83 is set to the 18-th rank and
a third group of the particular hypertext document D86
and the parent documents D81 and D83 is set to the 19-th
rank according to the second embodiment, as shown in
Fig. 12, a summary of the particular hypertext document
D83 is added to the first group, a summary of the
particular hypertext document D85 is added to the second
group and a summary of the particular hypertext document
D86 is added to the third group.

(Sixth Embodiment) 

In the world wide web, a composition (or an article)
is divided into a number of portions, and each portion of
the composition is written in one hypertext document.
Therefore, there is a case that a context of the
composition is not sufficiently expressed in one portion
of the composition written in one hypertext document. For
example, though an apple grown in Aomori is described
in the composition, the word "Aomori" indicating a
production place of the apple is not written in the
hypertext document D83 but is written in the parent
document D81.

Therefore, in cases where a plurality of keywords
expressing a context of a composition are separately
used in a hypertext document and a plurality of parent
documents having a reference relationship with the
hypertext document, the hypertext document is undesirably
ranked to a lower class in the prior art. However, in
the sixth embodiment, one combined hypertext document
produced by combining a retrieval hypertext document
(or a particular hypertext document) and one parent
document having a reference relationship with the
retrieval hypertext document is prepared for each of the
parent documents, importance degrees of the combined
hypertext documents are compared with each other, one
combined hypertext document having the maximum importance
degree is selected, and the maximum importance degree is
used as an importance degree for the retrieval hypertext
document.

Fig. 13 is a block diagram of a hypertext retrieving
apparatus according to a sixth embodiment of the
present invention.

As shown in Fig. 13, a hypertext retrieving apparatus 51
for retrieving one or more hypertext documents likely to
meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8 comprises the hypertext document table
with parent document list preparing unit 7, the retrieval
index preparing unit 6, the keyword input unit 2, the
retrieving unit 3,

a document ranking determining unit 52 for combining
one particular hypertext document and one particular
parent document corresponding to the particular
hypertext document to form a combined particular
hypertext document according to the document information
of the hypertext document table prepared by the hypertext
document table with parent document list preparing unit 7,
for each of the particular parent documents corresponding
to the particular hypertext document and each of the
particular hypertext documents obtained in the retrieving
unit 3, calculating estimated values for the combined
particular hypertext documents according to the particular
word information of the retrieval index obtained in the
retrieval index preparing unit 6 for each of the
particular hypertext documents, determining a plurality of
importance degrees of the combined particular hypertext
documents according to the estimated values for each of
the particular hypertext documents, comparing the
importance degrees of the combined particular hypertext
documents with each other for each of the particular
hypertext documents, selecting a maximum importance
degree among the importance degrees of the combined
particular hypertext documents relating to one particular
hypertext document for each of the particular hypertext
documents, setting the maximum importance degree to an
importance degree for the particular hypertext document
for each of the particular hypertext documents,
determining the ranking of the particular hypertext
documents according to those importance degrees and
preparing an index of one particular hypertext document
for each of the particular hypertext documents, and
a retrieval result displaying unit 53 for displaying a
group of the indexes of the particular hypertext documents
with the summaries of the particular hypertext documents
in the ranked order determined in the document ranking
determining unit 52 for each particular word as a
retrieval result.

In the above configuration, when a keyword "apple"
and another keyword "Aomori" are input by the user on
condition that a word "apple" appears in the hypertext
document D83 and a word "Aomori" indicating an
apple-producing prefecture does not appear in the
hypertext document D83 or D82 but appears in the
hypertext document D81, because a particular word "apple"
agreeing with the keyword "apple" appears in the hypertext
document D83, the hypertext document D83 is set as a
particular hypertext document in the retrieving unit 3.

Thereafter, in the document ranking determining unit 52,
the particular hypertext document D83 and the particular
parent document D81 are combined to form a first combined
particular hypertext document, the particular hypertext
document D83 and the particular parent document D82 are
combined to form a second combined particular hypertext
document, estimated values for the combined particular
hypertext documents are calculated for each of the
particular words, and a first sum of the estimated values
of the first combined particular hypertext document for
the particular words and a second sum of the estimated
values of the second combined particular hypertext
document for the particular words are calculated. In this
case, because the particular word "Aomori" does not appear
in the hypertext document D82 but appears in the hypertext
document D81, the first sum of the estimated values of the
first combined particular hypertext document is higher
than the second sum of the estimated values of the second
combined particular hypertext document. Therefore, the
first combined particular hypertext document is selected,
the first sum of the estimated values of the first
combined particular hypertext document is set as an
estimated value of the particular hypertext document D83
for the keywords "apple" and "Aomori", and an importance
degree for the particular hypertext document D83 is
calculated from the estimated value of the particular
hypertext document D83. In the same manner, importance
degrees for other particular hypertext documents are
calculated, and the ranking of the particular hypertext
documents is determined according to the importance
degrees.
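
The selection of the maximum among the combined documents can be illustrated with a short Python sketch. The texts, the IDF values and the function names below are hypothetical and are chosen only to mirror the D83, D81 and D82 example; the estimated value is taken as the sum of TF*IDF over the keywords, and a document without parents falls back to its own value, which is an assumption of this sketch.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9]+", text.lower())


def estimated_value(text: str, keywords: list, idf: dict) -> float:
    """Sum of TF*IDF over the keywords for one (combined) document."""
    counts = Counter(tokenize(text))
    return sum(counts[k.lower()] * idf.get(k.lower(), 0.0) for k in keywords)


def best_combined_value(doc_text: str, parents: dict, keywords: list,
                        idf: dict):
    """Combine the document with each parent in turn and keep the maximum.

    Returns (parent_id, value); parent_id is None when there are no parents.
    """
    if not parents:
        return None, estimated_value(doc_text, keywords, idf)
    scored = {
        pid: estimated_value(doc_text + " " + ptext, keywords, idf)
        for pid, ptext in parents.items()
    }
    best_id = max(scored, key=scored.get)
    return best_id, scored[best_id]


# Hypothetical texts and IDF values.
d83 = "The apple is crisp and sweet."
parents = {"D81": "Aomori is famous for fruit.", "D82": "A list of fruit pages."}
idf = {"apple": 0.5, "aomori": 1.0}

print(best_combined_value(d83, parents, ["apple", "Aomori"], idf))
```

With these sample values the combination with D81 wins, because only that combination contains both keywords.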

Accordingly, even though a plurality of keywords
expressing a context of a composition are separately
used in a hypertext document and a plurality of parent
documents having a reference relationship with the
hypertext document, because a combined particular
hypertext document obtained by combining one particular
hypertext document and one particular parent document
is formed for each of the particular parent documents
and a maximum estimated value of one combined particular
hypertext document among those of the combined particular
hypertext documents is set as an estimated value for the
particular hypertext document, there is no probability
that the particular hypertext document is undesirably
ranked to a lower class.

(Seventh Embodiment)

A heading portion of a hypertext document normally
indicates a feature of the hypertext document very
well. Therefore, to give heavy weight to a particular word
appearing in the heading portion of the hypertext
document, an occurrence frequency of the particular word
agreeing with one keyword in the heading portion of the
hypertext document is doubled. As an example of the
heading portion, a title of the hypertext document or an
anchor sentence of a parent document having a reference
relationship with the hypertext document is considered
in this embodiment.

Fig. 14 is a block diagram of a hypertext retrieving
apparatus according to a seventh embodiment of the
present invention.

As shown in Fig. 14, a hypertext retrieving apparatus 61
for retrieving one or more hypertext documents likely to
meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8 comprises the hypertext document table
with parent document list preparing unit 7, the retrieval
index preparing unit 6, the keyword input unit 2, the
retrieving unit 3,

a document ranking determining unit 62 for unifying
one particular hypertext document and one or more
particular parent documents corresponding to the
particular hypertext document to a unified particular
hypertext document according to the document
information of the hypertext document table prepared by
the hypertext document table with parent document list
preparing unit 7 for each of the particular hypertext
documents obtained in the retrieving unit 3, calculating
an occurrence frequency TF of one particular word in one
unified particular hypertext document for each particular
word and each unified particular hypertext document on
condition that an occurrence frequency of the particular
word appearing in a heading portion of the unified
particular hypertext document is doubled, calculating
an inverse document frequency IDF defined as an
inverse value of the number of particular hypertext
documents, in which one particular word appears,
for each particular word, calculating a product
TF*IDF of one occurrence frequency TF and one
inverse document frequency IDF, summing a plurality
of products for all particular words to produce a
summed product as an estimated value for each
particular hypertext document, determining a plurality
of importance degrees of the unified particular
hypertext documents according to the estimated
values, determining the ranking of the particular
hypertext documents according to the importance
degrees for the unified particular hypertext documents
and preparing an index of one particular hypertext
document for each of the particular hypertext
documents, and
a retrieval result displaying unit 63 for displaying the
indexes of the particular hypertext documents in the
ranked order determined in the document ranking
determining unit 62 as a retrieval result.

In the above configuration, a heading portion of
each unified particular hypertext document is composed
of a title of one particular hypertext document
corresponding to the unified particular hypertext
document and one or more anchor sentences of particular
parent documents having a reference relationship with
the particular hypertext document. For example, in
cases where a particular word agreeing with one keyword
appears six times in one unified particular hypertext
document on condition that the particular word
appears three times in the heading portion of the unified
particular hypertext document, the particular word
appearing in the heading portion of the unified particular
hypertext document is double-counted each time the
particular word appears, so that an occurrence frequency
TF of the particular word in the unified particular
hypertext document is equal to 9. Thereafter, one
particular hypertext document corresponding to the
unified particular hypertext document is ranked according
to the occurrence frequency TF=9.
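
The double counting in the example above (three occurrences in the heading portion counted twice, plus three elsewhere, giving 9) can be sketched in Python as follows. The function names and sample strings are hypothetical, and the weight is left as a parameter only for illustration.

```python
import re


def count_word(text: str, word: str) -> int:
    """Count case-insensitive whole-word occurrences of `word` in `text`."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tokens.count(word.lower())


def heading_weighted_tf(word: str, heading_texts: list, other_text: str,
                        heading_weight: int = 2) -> int:
    """TF in a unified document where each occurrence in the heading
    portion (title and anchor sentences) counts `heading_weight` times."""
    heading_hits = sum(count_word(h, word) for h in heading_texts)
    other_hits = count_word(other_text, word)
    return heading_weight * heading_hits + other_hits


# Hypothetical heading portion (a title and two anchor sentences) and body.
headings = ["Apple guide", "apple page", "more on the apple"]
body = "An apple a day. Apple pie and apple juice."

print(heading_weighted_tf("apple", headings, body))  # 2 * 3 + 3 = 9
```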

Accordingly, because the heading portion of the
hypertext document normally indicates a feature of the
hypertext document very well and the particular word
appearing in the heading portion of the unified particular
hypertext document is double-counted, reliability for the
ranking of the particular hypertext documents can be
further heightened.

In an HTML hypertext document written in the
hypertext mark-up language, a small index is expressed
by a character string surrounded by "<h1>" and "</h1>".
Therefore, it is applicable that the small index be
included in the heading portion of the HTML hypertext
document.

In this embodiment, the occurrence frequency of
the particular word appearing in the heading portion of
the unified particular hypertext document is doubled.
However, it is applicable that the occurrence frequency
of the particular word be multiplied by three or more.

(Eighth Embodiment)

In the hypertext documents of the world wide web,
there is a special hypertext document in which a
number of anchor sentences exist and no other sentences
exist. This special hypertext document is generally
called a link page. Even though the link page is
retrieved and displayed, no useful information meeting
a user's retrieval intention exists in the link page.
Therefore, an occurrence number of a particular word in
the link page is lowered to zero in this embodiment.

Fig. 15 is a block diagram of a hypertext retrieving 
apparatus according to an eighth embodiment of the 
present invention. 

As shown in Fig. 15, a hypertext retrieving apparatus 71
for retrieving one or more hypertext documents likely to
meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8 comprises the hypertext document table
with parent document list preparing unit 7, the retrieval
index preparing unit 6, the keyword input unit 2, the
retrieving unit 3,

a document ranking determining unit 72 for unifying
one particular hypertext document and one or more
particular parent documents corresponding to the
particular hypertext document to a unified particular
hypertext document according to the document
information of the hypertext document table prepared by
the hypertext document table with parent document list
preparing unit 7 for each of the particular hypertext
documents obtained in the retrieving unit 3, specifying a
link page from among the particular hypertext documents,
calculating an occurrence frequency TF of one particular
word in one unified particular hypertext document for
each particular word and each unified particular
hypertext document on condition that an occurrence
frequency of the particular word in the link page is
reduced by one each time the particular word is found out
in the link page treated as one particular parent
document of the unified particular hypertext document,
calculating an inverse document frequency IDF defined as
an inverse value of the number of particular hypertext
documents, in which one particular word appears, for each
particular word, calculating a product TF*IDF of one
occurrence frequency TF and one inverse document
frequency IDF, summing a plurality of products for all
particular words to produce a summed product as an
estimated value for each particular hypertext document,
determining a plurality of importance degrees of the
unified particular hypertext documents according to the
estimated values, determining the ranking of the
particular hypertext documents according to the
importance degrees for the unified particular hypertext
documents and preparing an index of one particular
hypertext document for each of the particular hypertext
documents, and

a retrieval result displaying unit 73 for displaying the
indexes of the particular hypertext documents in the
ranked order determined in the document ranking
determining unit 72 as a retrieval result.

In the above configuration, the hypertext document
D82 is, for example, a link page relating to the particular
word "apple" and is composed of ten anchor sentences.
Therefore, ten reference documents respectively having
a reference relationship with the hypertext document
D82 exist. When an occurrence frequency of the particular
word "apple" in a unified particular hypertext document
composed of one reference document treated as one
particular hypertext document and the hypertext document
D82 treated as one particular parent document is
calculated, an occurrence frequency of the particular
word "apple" in the hypertext document D82 treated as one
particular hypertext document is reduced by one each time
the particular word "apple" is found out in the particular
parent document D82. This reducing operation is performed
for all reference documents treated as the particular
hypertext documents.

Therefore, even though the particular word "apple"
appears in the hypertext document D82 many times, the
occurrence frequency of the particular word "apple" in
the hypertext document D82 is necessarily reduced to
zero, and the hypertext document D82 is ranked to the
lowest class.

Accordingly, any particular hypertext document
functioning as one link page can be always ranked to
the lowest class.
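
The effect of the reducing operation can be sketched in Python. This is a simplified illustration, not the procedure of the embodiment itself: instead of decrementing the count once per occurrence, the sketch simply skips link pages, which yields the same net contribution of zero. The names, the (text, flag) layout of the parent list and the sample texts are assumptions.

```python
import re


def count_word(text: str, word: str) -> int:
    """Count case-insensitive whole-word occurrences of `word` in `text`."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tokens.count(word.lower())


def unified_tf(word: str, document_text: str, parents: list) -> int:
    """TF in a unified document; parents is a list of
    (parent_text, is_link_page) pairs. Link pages contribute nothing."""
    total = count_word(document_text, word)
    for parent_text, is_link_page in parents:
        if is_link_page:
            continue  # net effect of reducing by one per occurrence
        total += count_word(parent_text, word)
    return total


def own_tf(word: str, document_text: str, is_link_page: bool) -> int:
    """TF of a document ranked on its own; a link page always scores zero."""
    return 0 if is_link_page else count_word(document_text, word)


# Hypothetical texts: the first parent stands in for the link page D82.
parents = [("apple apple apple", True), ("An apple farm", False)]
print(unified_tf("apple", "Apple juice recipes", parents))
print(own_tf("apple", "apple apple apple", True))
```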



(Ninth Embodiment)

There is a long hypertext document composed of a
plurality of blocks respectively corresponding to a
meaning, and a reference label is arranged in the top of
each block of the long hypertext document. In this
embodiment, the long hypertext document is divided
into the plurality of blocks, and a hypertext document
table corresponding to each block of the long hypertext
document is prepared.

Fig. 16 is a block diagram of a hypertext retrieving
apparatus according to a ninth embodiment of the
present invention.

As shown in Fig. 16, a hypertext retrieving apparatus 76
for retrieving one or more hypertext documents likely to
meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8 comprises

a hypertext document table with parent document
list preparing unit 77 for analyzing the hypertext
documents having the reference relationships
which are managed by the hypertext document
managing unit 8, specifying a long hypertext document
composed of a plurality of blocks respectively
corresponding to a meaning, setting each block of
the long hypertext document as one hypertext document
corresponding to one meaning, preparing hypertext
document information in which one or more parent
document identifiers identifying one or more parent
documents and anchor sentences of the parent documents
are listed with one hypertext document identifier
identifying one hypertext document and a document
storing position of the hypertext document, for each of
the hypertext documents, and preparing a hypertext
document table of the hypertext document information for
all hypertext documents managed by the hypertext
document managing unit 8,
the retrieval index preparing unit 6, the keyword
input unit 2, the retrieving unit 3, the document
ranking determining unit 4 and the retrieval result
displaying unit 73.

In the above configuration, as shown in Fig. 17, in
cases where a long hypertext document D87 composed
of a plurality of blocks respectively corresponding to a
meaning exists in the hypertext documents managed by
the hypertext document managing unit 8, the long
hypertext document D87 is specified by the hypertext
document table with parent document list preparing unit
77, and one or more reference labels respectively
arranged on the top of one block of the long hypertext
document D87 are found out. Thereafter, the long
hypertext document D87 is divided into the plurality of
blocks, and each block of the long hypertext document
D87 is set as one hypertext document D87, D88 or D89.
In this case, when the user reads a character string
"ABC" or "XYZ" of an anchor sentence of one hypertext
document, the user can immediately refer to the reference
label such as "#ABC" or "#XYZ" of another hypertext
document. Thereafter, a hypertext document table
of the hypertext document information for all hypertext
documents is prepared in the same manner as in the
first embodiment.

Accordingly, even though a long hypertext document
composed of a plurality of blocks respectively
corresponding to a meaning exists in the hypertext
documents, because the long hypertext document is
divided into the blocks and each block of the long
hypertext document is set as one hypertext document to
prepare the hypertext document information for each block
of the long hypertext document, the hypertext documents
respectively relating to one meaning can be ranked, so
that the user can easily retrieve a group of hypertext
documents likely to meet his request.

In this embodiment, in cases where a small index
expressed by a character string surrounded by "<h1>"
and "</h1>" is used in a long hypertext document, it is
applicable that the long hypertext document be divided
into a plurality of blocks on condition that one reference
label or one small index is arranged on the top of each
block.
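
The division of a long hypertext document into blocks can be illustrated with the following Python sketch. It assumes, purely for illustration, that each reference label is written as an HTML named anchor of the form <a name="...">, and it identifies each block by the hypothetical identifier "document#label" rather than by separate document numbers such as D88 or D89. A regular expression is used only for brevity; a real implementation would more likely use an HTML parser.

```python
import re

# Assumed encoding of a reference label: <a name="LABEL">
LABEL = re.compile(r'<a\s+name="([^"]+)"[^>]*>', re.IGNORECASE)


def split_into_blocks(doc_id: str, html: str) -> dict:
    """Split a document at each reference label; every block becomes a
    separate entry keyed by a hypothetical '<doc_id>#<label>' identifier."""
    matches = list(LABEL.finditer(html))
    if not matches:
        return {doc_id: html}
    blocks = {}
    preamble = html[:matches[0].start()]
    if preamble.strip():
        blocks[doc_id] = preamble  # text before the first label
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        blocks[f"{doc_id}#{match.group(1)}"] = html[match.start():end]
    return blocks


page = 'Intro. <a name="ABC"></a>First block. <a name="XYZ"></a>Second block.'
for block_id, text in split_into_blocks("D87", page).items():
    print(block_id, "->", text)
```

Each resulting block would then receive its own entry in the hypertext document table, so that an anchor pointing at "#ABC" is treated as a reference to that block only.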


(Tenth Embodiment)

In cases where the user intends to retrieve a plurality
of hypertext documents again by changing an initial
keyword to another keyword which relates to a plurality
of particular hypertext documents displayed according
to the initial keyword, the user generally desires to
know one or more words frequently appearing in the
particular hypertext documents. Therefore, in this
embodiment, one or more words frequently appearing
in the particular hypertext documents are displayed.

Fig. 18 is a block diagram of a hypertext retrieving 
apparatus according to a tenth embodiment of the 
present invention. 

As shown in Fig. 18, a hypertext retrieving apparatus 91
for retrieving one or more hypertext documents likely to
meet a user's retrieval request from a large volume of
hypertext documents stored in the hypertext document
managing unit 8 comprises

the hypertext document table with parent document 
list preparing unit 7. the retrieval index preparing 
unit 6, the keyword input urft 2, the retrieving unit 3, 
a document ranWng determinNig unH 92 for unifying 
one particular hypertext document and one or more so 
particular parent documents corresponding to the 
particular hypertext document to a unified particular 
hypertext document according to the document 
information of the hypertext document table pre- 
pared by the hypertext document table with parent ss 
document list preparing unit 7 for each of the partic- 
ular hypertext documents obtained in the retrieving 
unit 3, calculating an occiarence frequency TF of 
one particular word In one unified particular hyper- 



te)rt document for each particular word and each 
unified particular hypertext document, calculating 
an inverse document frequency IDF defined as an 
inverse value of the number of particular hypertext 
documents, in which one particular word appears, 
for eadi particular word, calculating a product 
TF'IDF of one occurrence frequency TF and one 
inverse document frequency IDF. summing a plural- 
ity of products for aN particular vrords to produce a 
summed product as an estimated value for each 
particular hypertext document, determining a plu- 
rality of importance degrees of the unified particular 
hypertext documents according to the estimated 
values, determining the ranking of the particular 
hypertext documents according to ttie importance 
degrees for the unled particular hypertext docu- 
ments, preparing an index of one particular hyper- 
text document for each of the particular hypertext 
documents, selecting a plurality of high-ranking 
hypertext documents from the particular hypertext 
documents, extracting a plurality of related words 
listed in a plurality of word lists of pieces of hyper- 
text document information of the hypertext docu- 
ment table con-esponding to the high-ranking 
hypertext documents, calculating an ocojrrence 
frec^jency TF of one related word in one high-rank- 
ing hypertext document for each related worcf and 
each high-ranking hypertext document, catoutafing 
an inverse document frequency IDF defined as an 
inverse value oif the number of high-ranking hyper- 
text documents, in which one related word appears, 
for each related word, calculating a sum of a plural- 
ity of products TF*IDF for all high-ranking hypertext 
documents to produce a summed product as an 
importance degree for each related word, compar- 
ing the importance degrees of the related words 
with each other, selecting a plurality of high-ranking 
related words of which the importance degrees are 
higher than those of other related words, and pre- 
paring a hypertext mark-up language {HTML) docu- 
ment in wtwch a plurality of keyword selection 
txjttons corresponding I0 the high-ranWr^ related 
words are arranged in the decreasing order of the 
importance degrees of the high-ranking related 
words to select one high-ranking related word by 
pushing one keyword selection button, and 
a retrieval result displaying unit 93 for displaying the 
indexes of the p^ticular hypertext documents in the 
ranked order determined in the document ranking 
determining unit 92 as a retrieval result on a result 
displaying window W1 and displaying the HTML 
document pr^ared by the doament ranking deter- 
mining unit 92 on a high-ranking related worcf 
selecting window W2. 
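
The document ranking step of unit 92 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the names `tokenize` and `rank_documents` are hypothetical, words are split on whitespace, and each unified document is given as a single string (the body joined with the text of its parent documents). IDF is taken, as in the description, to be the inverse of the number of documents in which the word appears.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real word recognition."""
    return text.lower().split()


def rank_documents(unified_docs, keywords):
    """Rank unified documents by the sum over keywords of TF * IDF.

    unified_docs maps a document identifier to the text of the document
    unified with its parent documents (an assumed representation).
    """
    counts = {doc_id: Counter(tokenize(text))
              for doc_id, text in unified_docs.items()}
    scores = {}
    for doc_id, tf in counts.items():
        score = 0.0
        for word in keywords:
            # Number of documents in which the word appears.
            df = sum(1 for c in counts.values() if c[word] > 0)
            if df:
                score += tf[word] * (1.0 / df)  # TF * IDF
        scores[doc_id] = score
    # Decreasing order of importance; ties broken by identifier.
    ranking = sorted(scores, key=lambda d: (-scores[d], d))
    return ranking, scores


docs = {
    "D83": "apple apple farm",
    "D85": "apple festival",
    "D86": "farm manure",
}
ranking, scores = rank_documents(docs, ["apple", "farm"])
```

With these three toy documents, "apple" and "farm" each appear in two documents, so D83 is ranked first.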

In the above configuration, in cases where the tenth embodiment and
the third embodiment are combined, as shown in Fig. 19, when a
keyword "apple" is input to the keyword input unit 2, a plurality of
indexes of particular hypertext documents such as documents D83, D85
and D86 and a plurality of indexes of parent documents such as
documents D80 and D81 are, for example, displayed on the result
displaying window W1 in the same manner as in the third embodiment.
Thereafter, in the document ranking determining unit 92, ten
high-ranking hypertext documents are selected from the particular
hypertext documents, a plurality of related words listed in a
plurality of word lists of pieces of hypertext document information
of the hypertext document table corresponding to the high-ranking
hypertext documents are extracted, a sum of a plurality of products
TF*IDF for all high-ranking hypertext documents is calculated for
each related word, and importance degrees for the related words are
determined. Thereafter, ten high-ranking related words "Shinshu",
"farmer", "product", "Aomori", "manure", "farm", "festival",
"Nebuta", "Nagano" and "Olympics" are selected from the related
words, an HTML document in which ten keyword selection buttons
corresponding to the high-ranking related words are arranged in the
decreasing order of the importance degrees of the high-ranking
related words is prepared, and the HTML document is displayed on the
high-ranking related word selecting window W2.
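
The related-word selection described above can be sketched in Python as follows. This is a minimal illustration, not the apparatus itself: the function names are hypothetical, each word list is assumed to keep repeated words so that TF can be counted from it, and excluding the input keyword from the candidates is an assumption the description does not state.

```python
from collections import Counter


def related_word_importance(word_lists, keywords=()):
    """Importance of each related word: sum over documents of TF * IDF.

    word_lists maps a high-ranking document identifier to its word list.
    IDF is the inverse of the number of high-ranking documents in which
    the word appears.
    """
    counts = {doc_id: Counter(words) for doc_id, words in word_lists.items()}
    vocabulary = set()
    for c in counts.values():
        vocabulary.update(c)
    # Assumption: the keywords themselves are not offered as related words.
    vocabulary -= set(keywords)
    importance = {}
    for word in vocabulary:
        df = sum(1 for c in counts.values() if c[word] > 0)
        importance[word] = sum(c[word] * (1.0 / df) for c in counts.values())
    return importance


def top_related_words(importance, n=10):
    """Related words in decreasing order of importance, ties alphabetical."""
    return sorted(importance, key=lambda w: (-importance[w], w))[:n]


word_lists = {
    "D83": ["apple", "shinshu", "shinshu", "farmer"],
    "D85": ["apple", "shinshu", "festival"],
}
importance = related_word_importance(word_lists, keywords=("apple",))
buttons = top_related_words(importance, n=2)
```

The resulting list gives the order in which the keyword selection buttons would be arranged on window W2.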

Therefore, when the user pushes the keyword button corresponding to
the high-ranking related word "Shinshu", the word "Shinshu"
indicating an apple-producing district is input to the keyword input
unit 2 as a keyword, importance degrees of a plurality of particular
hypertext documents corresponding to the keyword "Shinshu" are
determined, and the particular hypertext documents arranged in the
decreasing order of the importance degrees are displayed on the
result displaying window W1 in the same manner as in the first
embodiment.

Accordingly, even though the user cannot initially bring an
appropriate keyword to his mind, the user can select one or more
keywords closer to his retrieval intention. Also, the user can
change his retrieval intention by referring to the high-ranking
related words, and a plurality of particular hypertext documents
corresponding to a new keyword selected by the user according to his
new retrieval intention can be displayed.

In this case, the user can push the keyword selection button by
using a pointing device without using a keyboard. Also, the keyword
selection buttons are embodied by operating a JAVA script in which
the high-ranking related words are added to a text box, a "clear"
button is embodied by operating a JAVA script in which one
high-ranking related word added to the text box is cleared, an
"initial condition" button is embodied by operating a JAVA script in
which the high-ranking related words added to the text box are
returned to an initial group of keywords such as "apple", and a
"re-retrieval" button is embodied by operating a JAVA script in
which a retrieval operation is again performed by using one or more
words added to the text box as one or more keywords.
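
The behaviour of the four buttons can be modelled as a small state object. The sketch below uses Python rather than the JAVA script named in the text, and every name in it (`KeywordTextBox`, `retrieve`, the method names) is hypothetical. It also assumes that "clear" removes the most recently added related word, which the description leaves open.

```python
class KeywordTextBox:
    """Models the text box driven by the four buttons described above."""

    def __init__(self, initial_keywords, retrieve):
        self._initial = list(initial_keywords)
        self._added = []
        # Hypothetical callable: list of keywords -> retrieval result.
        self._retrieve = retrieve

    @property
    def words(self):
        """Current contents of the text box."""
        return self._initial + self._added

    def push_keyword_button(self, related_word):
        """Keyword selection button: add a related word to the text box."""
        if related_word not in self.words:
            self._added.append(related_word)

    def push_clear(self):
        """'clear' button (assumed: drops the last related word added)."""
        if self._added:
            self._added.pop()

    def push_initial_condition(self):
        """'initial condition' button: return to the initial keywords."""
        self._added.clear()

    def push_re_retrieval(self):
        """'re-retrieval' button: retrieve again with the current words."""
        return self._retrieve(self.words)


box = KeywordTextBox(["apple"], retrieve=lambda kws: sorted(kws))
box.push_keyword_button("Shinshu")
box.push_keyword_button("farmer")
box.push_clear()
result = box.push_re_retrieval()
```

Here `retrieve` stands in for the retrieving unit; any function that accepts the keyword list could be supplied.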

In this embodiment, the high-ranking hypertext documents are
selected from the particular hypertext documents. However, it is
applicable that the high-ranking hypertext documents be selected
from the particular hypertext documents and the parent documents. In
this case, a plurality of related words can be widely collected from
a plurality of hypertext documents having a reference relationship
with each other.

(Eleventh Embodiment) 

In the tenth embodiment, the importance degrees of the related words
are determined without any connection with the keyword initially
input by the user. However, in cases where the user desires to
select a related word having a close correlation with the keyword,
it is preferred that a related word having a close correlation with
a keyword be preferentially selected as a high-ranking related word.
Therefore, in this embodiment, an occurrence frequency of a related
word having a close correlation with a keyword is doubled to
heighten an importance degree of the related word.

Fig. 20 is a block diagram of a hypertext retrieving apparatus
according to an eleventh embodiment of the present invention.

As shown in Fig. 20, a hypertext retrieving apparatus 101 for
retrieving one or more hypertext documents likely to meet a user's
retrieval request from a large volume of hypertext documents stored
in the hypertext document managing unit 8, comprises

the hypertext document table with parent document list preparing
unit 7, the retrieval index preparing unit 6, the keyword input
unit 2, the retrieving unit 3,
a document ranking determining unit 102 for unifying one particular
hypertext document and one or more particular parent documents
corresponding to the particular hypertext document to a unified
particular hypertext document according to the document information
of the hypertext document table prepared by the hypertext document
table with parent document list preparing unit 7 for each of the
particular hypertext documents obtained in the retrieving unit 3,
calculating an occurrence frequency TF of one particular word in one
unified particular hypertext document for each particular word and
each unified particular hypertext document, calculating an inverse
document frequency IDF defined as an inverse value of the number of
particular hypertext documents, in which one particular word
appears, for each particular word, calculating a product TF*IDF of
one occurrence frequency TF and one inverse document frequency IDF,
summing a plurality of products for all particular words to produce
a summed product as an estimated value for each particular hypertext
document, determining a plurality of importance degrees of the
unified particular hypertext documents according to the estimated
values, determining the ranking of the particular hypertext
documents according to the importance degrees for the unified
particular hypertext documents, preparing an index of one particular
hypertext document for each of the particular hypertext documents,
selecting a plurality of high-ranking hypertext documents from the
particular hypertext documents, extracting a plurality of related
words listed in a plurality of word lists of pieces of hypertext
document information of the hypertext document table corresponding
to the high-ranking hypertext documents, calculating an occurrence
frequency TF of one related word in one high-ranking hypertext
document for each related word and each high-ranking hypertext
document on condition that the related word is double-counted when
the related word is placed within a distance of 40 letters from one
keyword, calculating an inverse document frequency IDF defined as an
inverse value of the number of high-ranking hypertext documents, in
which one related word appears, for each related word, calculating a
sum of a plurality of products TF*IDF for all high-ranking hypertext
documents to produce a summed product as an importance degree for
each related word, comparing the importance degrees of the related
words with each other, selecting a plurality of high-ranking related
words of which the importance degrees are higher than those of other
related words, and preparing a hypertext mark-up language (HTML)
document in which a plurality of keyword selection buttons
corresponding to the high-ranking related words are arranged in the
decreasing order of the importance degrees of the high-ranking
related words to select one high-ranking related word by pushing one
keyword selection button, and
a retrieval result displaying unit 103 for displaying the indexes of
the particular hypertext documents in the ranked order determined in
the document ranking determining unit 102 as a retrieval result on a
result displaying window W1 and displaying the HTML document
prepared by the document ranking determining unit 102 on a
high-ranking related word selecting window W2.



In the above configuration, after the related words are extracted in
the same manner as in the tenth embodiment, an occurrence frequency
TF of one related word in one high-ranking hypertext document is
calculated for each related word and each high-ranking hypertext
document. In this case, when the related word is placed within a
distance of 40 letters from one keyword "apple", the related word is
double-counted. Therefore, because the related word "Shinshu"
indicating an apple-producing district or the related word "farmer"
often appears within a distance of 40 letters from one keyword
"apple" and because the related word "Nagano" indicating an
apple-producing prefecture or the related word "Olympics" indicating
a festival held in Nagano in 1998 hardly appears within a distance
of 40 letters from one keyword "apple", as shown in Fig. 21, the
related words "Shinshu" and "farmer" are reliably displayed on the
head portion of the high-ranking related word selecting window W2,
and the related words "Nagano" and "Olympics" are displayed on the
rear portion of the high-ranking related word selecting window W2
even though the related words "Nagano" and "Olympics" frequently
appear in the particular hypertext documents.
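
The double-counting rule can be sketched in Python as below. The names `find_positions` and `proximity_tf` are hypothetical, matching is done on plain lowercase substrings, and the distance is assumed to be the difference between the starting character offsets of the related word and the keyword; the description does not fix how the 40 letters are measured.

```python
def find_positions(text, word):
    """Starting character offsets of every occurrence of word in text."""
    positions = []
    start = text.find(word)
    while start != -1:
        positions.append(start)
        start = text.find(word, start + 1)
    return positions


def proximity_tf(text, related_word, keyword, window=40):
    """Occurrence frequency of related_word with proximity weighting.

    An occurrence counts twice when it lies within `window` characters
    of some occurrence of the keyword, and once otherwise.
    """
    text = text.lower()
    keyword_positions = find_positions(text, keyword.lower())
    total = 0
    for pos in find_positions(text, related_word.lower()):
        near = any(abs(pos - k) <= window for k in keyword_positions)
        total += 2 if near else 1
    return total


sample = "apple growers in Shinshu " + "x" * 60 + " Shinshu"
weighted = proximity_tf(sample, "Shinshu", "apple")
unweighted = proximity_tf(sample, "Shinshu", "pear")
```

In the sample text the first "Shinshu" lies close to "apple" and is counted twice, while the second lies more than 40 characters away and is counted once.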

Accordingly, one or more related words having a strong relationship
with the keyword can be displayed in high-ranking positions, and one
or more related words corresponding to a user's retrieval intention
differing from the initial retrieval intention can be displayed in
low-ranking positions.

Having illustrated and described the principles of the present
invention in a preferred embodiment thereof, it should be readily
apparent to those skilled in the art that the invention can be
modified in arrangement and detail without departing from such
principles. We claim all modifications coming within the scope of
the accompanying claims.

Claims 

1. A hypertext document retrieving apparatus for retrieving a
plurality of particular hypertext documents likely to meet a user's
retrieval request from a group of hypertext documents having
reference relationships with each other in which one hypertext
document having an anchor sentence functions as a parent document
for another hypertext document functioning as a reference document
and a user refers to one reference document after the user selects
one anchor sentence of one parent document corresponding to the
reference document, comprising:

hypertext document table preparing means for preparing hypertext
document information, in which one hypertext document identifier
identifying one hypertext document, a body of the hypertext
document, a parent document identifier identifying a parent document
corresponding to the hypertext document functioning as one reference
document and an anchor sentence of the parent document are
registered, for each of the hypertext documents and preparing a
hypertext document table of the hypertext document information for
the hypertext documents;

retrieval index preparing means for recognizing a plurality of words
appearing in each of the hypertext documents and the parent
documents according to the hypertext document table prepared by the
hypertext document table preparing means, recognizing a plurality of
occurrence positions of the words in each of the hypertext documents
and the parent documents according to the hypertext document table,
preparing word information, composed of one or more occurrence
document identifiers identifying one or more hypertext documents in
which one word appears and occurrence positions of the word in the
hypertext documents and one or more anchor sentences of one or more
parent documents corresponding to the hypertext documents, for each
of the words, and preparing a retrieval index of pieces of word
information for the words;

keyword receiving means for receiving a keyword indicating the
user's retrieval request;

retrieving means for retrieving particular word information
corresponding to the keyword received by the keyword receiving means
from the retrieval index prepared by the retrieval index preparing
means and retrieving a plurality of particular occurrence document
identifiers identifying a plurality of particular hypertext
documents in which the keyword appears and a plurality of particular
occurrence positions of the keyword in the particular hypertext
documents and one or more particular anchor sentences of one or more
particular parent documents corresponding to the particular
hypertext documents from the particular word information;

document ranking determining means for specifying the particular
hypertext documents which are identified by the particular
occurrence document identifiers retrieved by the retrieving means,
retrieving pieces of particular hypertext document information for
the particular hypertext documents from the hypertext document table
prepared by the hypertext document table preparing means, unifying
one particular hypertext document and one or more particular parent
documents corresponding to the particular hypertext document to a
unified hypertext document for each of the particular hypertext
documents, calculating an occurrence frequency of the keyword in one
unified hypertext document for each unified hypertext document,
determining a plurality of importance degrees of the unified
hypertext documents according to the occurrence frequencies in the
unified hypertext documents, setting one importance degree of one
unified hypertext document as an importance degree of one particular
hypertext document corresponding to the unified hypertext document
for each unified hypertext document and determining the ranking of
the particular hypertext documents according to the importance
degrees of the particular hypertext documents; and

retrieval result displaying means for displaying a plurality of
indexes of the particular hypertext documents in a ranked order
corresponding to the ranking of the particular hypertext documents
determined by the document ranking determining means as a retrieval
result.

2. A hypertext document retrieving apparatus according to claim 1 in
which an index of one particular parent document corresponding to
one particular hypertext document is displayed with the index of the
particular hypertext document by the retrieval result displaying
means for each of the particular hypertext documents.

3. A hypertext document retrieving apparatus according to claim 1 in
which a plurality of particular hypertext documents corresponding to
the same particular parent document are reset to the same rank as a
highest rank among the ranks determined for the particular hypertext
documents by the document ranking determining means, and the
particular hypertext documents set to the same rank are displayed
with the particular parent document in a group by the retrieval
result displaying means.

4. A hypertext document retrieving apparatus according to claim 1 in
which a plurality of particular hypertext documents corresponding to
the same particular parent document are reset to a same rank
according to a sum of the importance degrees for the particular
hypertext documents by the document ranking determining means, and
the particular hypertext documents set to the same rank are
displayed with the particular parent document in a group by the
retrieval result displaying means.

5. A hypertext document retrieving apparatus according to claim 1 in
which each of the unified hypertext documents is formed by the
document ranking determining means by unifying one or more anchor
sentences of one or more particular parent documents corresponding
to one particular hypertext document and the particular hypertext
document.

6. A hypertext document retrieving apparatus according to claim 1 in
which a particular sentence or a particular phrase including the
keyword is extracted from each of the particular hypertext documents
by the document ranking determining means, and a summary in which
one particular sentence or one particular phrase of one particular
hypertext document is written in succession to a top sentence or a
top phrase of the particular hypertext document is displayed with
the index of the particular hypertext document for each of the
particular hypertext documents.

7. A hypertext document retrieving apparatus according to claim 1 in
which the importance degree of each of the unified hypertext
documents is determined by the document ranking determining means by
calculating a sum of an occurrence frequency of the keyword in one
hypertext document and an occurrence frequency of the keyword in one
parent document corresponding to the hypertext document for each of
the parent documents corresponding to the hypertext document,
selecting a maximum sum among the sums for the parent documents,
specifying one particular parent document corresponding to the
maximum sum, determining one importance degree for a combination of
the hypertext document and the particular parent document according
to the maximum sum and regarding the importance degree as one
importance degree of one unified hypertext document corresponding to
the hypertext document.

8. A hypertext document retrieving apparatus according to claim 1 in
which the occurrence frequency of the keyword in each unified
hypertext document is calculated by the document ranking determining
means by double-counting the keyword appearing in one or more anchor
sentences of one or more particular parent documents corresponding
to the unified hypertext document.

9. A hypertext document retrieving apparatus according to claim 1 in
which the occurrence frequency of the keyword in one hypertext
document functioning as a link page composed of one or more anchor
sentences is set to zero by the document ranking determining means.

10. A hypertext document retrieving apparatus according to claim 1
in which one hypertext document having contents corresponding to a
plurality of meanings respectively identified by a reference label
is divided into a plurality of blocks by the hypertext document
table preparing means to include one reference label in a top of
each block, and one hypertext document information is prepared for
each block of the hypertext document by the hypertext document table
preparing means.

11. A hypertext document retrieving apparatus according to claim 1
in which a predetermined number of high-ranking particular hypertext
documents are selected from among the particular hypertext documents
by the document ranking determining means, a plurality of related
words appearing in the high-ranking particular hypertext documents
are extracted from the high-ranking particular hypertext documents
by the document ranking determining means, a plurality of importance
degrees of the related words are calculated from a plurality of
occurrence frequencies of the related words in the high-ranking
particular hypertext documents by the document ranking determining
means, a predetermined number of high-ranking related words are
selected from the related words ranked according to the importance
degrees of the related words by the document ranking determining
means, and a plurality of selection buttons for the high-ranking
related words are displayed with the indexes of the particular
hypertext documents by the retrieval result displaying means.

12. A hypertext document retrieving apparatus according to claim 1
in which a predetermined number of high-ranking particular hypertext
documents are selected from among the particular hypertext documents
by the document ranking determining means, a plurality of related
words appearing in the high-ranking particular hypertext documents
and a plurality of particular parent documents corresponding to the
high-ranking particular hypertext documents are extracted from the
high-ranking particular hypertext documents by the document ranking
determining means, a plurality of importance degrees of the related
words are calculated from a plurality of occurrence frequencies of
the related words in the high-ranking particular hypertext documents
and the particular parent documents by the document ranking
determining means, a predetermined number of high-ranking related
words are selected from the related words ranked according to the
importance degrees of the related words by the document ranking
determining means, and a plurality of selection buttons for the
high-ranking related words are displayed with the indexes of the
particular hypertext documents by the retrieval result displaying
means.

13. A hypertext document retrieving apparatus according to claim 1
in which a predetermined number of high-ranking particular hypertext
documents are selected from among the particular hypertext documents
by the document ranking determining means, a plurality of related
words appearing in the high-ranking particular hypertext documents
are extracted from the high-ranking particular hypertext documents
by the document ranking determining means, an occurrence frequency
of each related word in the high-ranking particular hypertext
documents is calculated by the document ranking determining means on
condition that the related word appearing in one high-ranking
particular hypertext document is double-counted in cases where an
occurrence position of the related word is near to an occurrence
position of the keyword, a plurality of importance degrees of the
related words are calculated from the occurrence frequencies of the
related words by the document ranking determining means, a
predetermined number of high-ranking related words are selected from
the related words ranked according to the importance degrees of the
related words by the document ranking determining means, and a
plurality of selection buttons for the high-ranking related words
are displayed with the indexes of the particular hypertext documents
by the retrieval result displaying means.

14. A hypertext document retrieving apparatus according to claim 1
in which a predetermined number of high-ranking particular hypertext
documents are selected from among the particular hypertext documents
by the document ranking determining means, a plurality of related
words appearing in the high-ranking particular hypertext documents
and a plurality of particular parent documents corresponding to the
high-ranking particular hypertext documents are extracted from the
high-ranking particular hypertext documents by the document ranking
determining means, an occurrence frequency of each related word in
the high-ranking particular hypertext documents and the particular
parent documents is calculated by the document ranking determining
means on condition that the related word appearing in one
high-ranking particular hypertext document or one particular parent
document is double-counted in cases where an occurrence position of
the related word is near to an occurrence position of the keyword, a
plurality of importance degrees of the related words are calculated
from the occurrence frequencies of the related words by the document
ranking determining means, a predetermined number of high-ranking
related words are selected from the related words ranked according
to the importance degrees of the related words by the document
ranking determining means, and a plurality of selection buttons for
the high-ranking related words are displayed with the indexes of the
particular hypertext documents by the retrieval result displaying
means.

15. A hypertext document retrieving apparatus according to claim 1
in which a plurality of keywords are received by the keyword
receiving means, an occurrence frequency TF of one keyword in one
unified hypertext document is calculated by the document ranking
determining means for each keyword and each unified hypertext
document, an inverse document frequency IDF defined as an inverse
value of the number of particular hypertext documents in which one
keyword appears is calculated by the document ranking determining
means for each keyword, a product TF*IDF of one occurrence frequency
TF and one inverse document frequency IDF is calculated by the
document ranking determining means, a plurality of products for the
keywords are summed by the document ranking determining means to
produce a summed product as an estimated value for each unified
particular hypertext document, and the importance degrees of the
unified hypertext documents are determined according to the
estimated values by the document ranking determining means.

16. A hypertext document retrieving apparatus according to claim 15
in which one estimated value for one unified particular hypertext
document is increased to heighten the rank of the particular
hypertext document in cases where two or more keywords appear in the
unified particular hypertext document or a distance of two keywords
in the unified particular hypertext document is within a
predetermined number of words.
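
As an illustration only, the frequency rules recited in claims 7 and 8 might be sketched in Python as follows. The function names are hypothetical, words are split on whitespace, parent documents are assumed to be given as a mapping from identifier to text, and ties between parent documents are assumed to be broken by identifier; none of these details is fixed by the claims.

```python
def count(text, keyword):
    """Number of whitespace-separated occurrences of keyword in text."""
    return text.lower().split().count(keyword.lower())


def max_parent_sum(body, parent_bodies, keyword):
    """Claim 7 style: best sum of document and parent frequencies.

    Returns the maximum sum and the identifier of the parent document
    that yields it (None when the document has no parent).
    """
    base = count(body, keyword)
    if not parent_bodies:
        return base, None
    sums = {pid: base + count(text, keyword)
            for pid, text in parent_bodies.items()}
    # Assumption: ties go to the smallest parent identifier.
    best = max(sorted(sums), key=sums.get)
    return sums[best], best


def anchor_weighted_frequency(body, anchor_sentences, keyword):
    """Claim 8 style: keyword in anchor sentences is counted twice."""
    in_anchors = sum(count(a, keyword) for a in anchor_sentences)
    return count(body, keyword) + 2 * in_anchors


parents = {"D80": "apple apple orchard", "D81": "apple news"}
best_sum, best_parent = max_parent_sum("apple pie recipe", parents, "apple")
weighted = anchor_weighted_frequency(
    "apple pie", ["apple recipes", "desserts"], "apple")
```

The maximum sum or the weighted frequency would then be mapped to an importance degree by whatever rule the ranking means applies.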
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